Preface to the Fifth Edition


In the Preface to the first edition of this book, published over forty years ago in 1982, we 
wrote that our aim was to help the reader to acquire a ‘reasonable understanding of gauge 
theories that are being tested by contemporary experiments in high-energy physics’; and 
we stressed that our approach was intended to be both practical and accessible. 

That first edition ran to just 341 pages. Subsequent editions saw successive enlargements,
motivated both by experimental advances and by an ongoing need to provide an
accessible basis for their theoretical understanding. Thus, shortly after the appearance of
the first edition, a series of major discoveries at the CERN $p\bar{p}$ collider confirmed the existence
of the W and Z bosons, with properties predicted by the Glashow-Salam-Weinberg
electroweak gauge theory; and also provided further support for quantum chromodynamics, 
or QCD. The second edition (1989) included an extended discussion of these developments. 
It also contained a significant new theoretical component — an elementary introduction to 
quantum field theory (qft). This was motivated partly by the undeniable fact that qft lies 
at the heart of the twentieth-century answer to all those ancient questions concerning the 
nature of matter and force, and is the language of the Standard Model; and partly with 
an eye on the increasing precision of experiments, which require the inclusion of radiative 
corrections for their theoretical interpretation. 

Indeed, experiments at LEP and other laboratories were soon precise enough to test the 
Standard Model beyond the first order in perturbation theory (‘tree level’), being sensitive 
to higher-order effects (‘loops’). In response, we decided it was appropriate to include the 
basics of ‘one-loop physics’. Together with the existing material on relativistic quantum 
mechanics, and on the Abelian gauge theory QED, this comprised volume 1 (2003) of our 
two-volume third edition. The non-Abelian gauge theories of the Standard Model, QCD 
and the electroweak theory, formed the core of volume 2 (2004). The progress of research 
on QCD, both theoretical and experimental, required new chapters on lattice quantum field 
theory, and on the renormalization group. The discussion of the central topic of spontaneous 
symmetry breaking was extended, in particular so as to include chiral symmetry breaking. 
In these theory additions, our aim was to give a self-contained introduction, with enough 
content to provide readers with access to more advanced treatments. 

New experimental results dictated the principal new additions in volume 2 of the fourth 
edition (2012) — namely, in the areas of CP violation and neutrino oscillations. We were 
able to conclude volume 2 with an introductory discussion of the historic 2012 discovery of a 
boson which seemed very likely to be the Higgs boson of the Standard Model. In volume 1 of 
the fourth edition, we added (perhaps belatedly) a new chapter on Lorentz transformations 
and discrete symmetries. We also introduced Weyl and Majorana fermions. 

Now, more than a decade after the fourth edition, the Standard Model has been subjected 
to ever more precise experimental tests, which it has so far withstood successfully. While no 
major new discoveries have been reported, the wealth of new data strongly suggested the 
need for a new edition. We thank our editor, Rebecca Hodges-Davies, and the reviewers for 
enthusiastically supporting our proposal. The most substantial updates naturally appear 
in volume 2, as will be detailed in the Preface to that volume. Volume 1, meanwhile, sees
little change, apart from minor improvements in the text, and some experimental updates,
including the latest (2023) news about the muon $g-2$ situation.


1
The Particles and Forces of the Standard Model 


1.1 Introduction: the Standard Model 


The traditional goal of particle physics has been to identify what appear to be structureless 
units of matter and to understand the nature of the forces acting between them; all other 
entities are then to be successively constructed as composites of these elementary building 
blocks. The enterprise has a two-fold aspect: matter on the one hand, forces on the other. The 
expectation is that the smallest units of matter should interact in the simplest way, or that 
there is a deep connection between the basic units of matter and the basic forces. The joint 
matter/force nature of the enquiry is perfectly illustrated by Thomson’s discovery of the 
electron and Maxwell’s theory of the electromagnetic field, which together mark the birth of 
modern particle physics. The electron was recognized both as the ‘particle of electricity’—or 
as we might now say, as an elementary source of the electromagnetic field, with its motion 
constituting an electromagnetic current—and also as an important constituent of matter. 
In retrospect, the story of particle physics over the subsequent one hundred years or so has 
consisted of the discovery and study of two new (non-electromagnetic) forces—the weak 
and the strong forces—and in the search for ‘electron-figures’ to serve both as constituents 
of the new layers of matter which were uncovered (first nuclei and then hadrons) and also 
as sources of the new force fields. In the last quarter of the twentieth century, this effort 
culminated in decisive progress: the identification of a collection of matter units which are 
indeed analogous to the electron, and the highly convincing experimental verification of 
theories of the associated strong and weak force fields, which incorporate and generalize in 
a beautiful way the original electron/electromagnetic field relationship. These theories are 
collectively called ‘the Standard Model’ (or SM for short), to which this book is intended 
as an elementary introduction. 

In brief, the picture is as follows. The matter units are fermions, with spin-1/2 (in units of
$\hbar$). They are of two types, leptons and quarks. Both are structureless at the smallest distances
currently probed by the highest-energy accelerators. The leptons are generalizations of the 
electron, the term denoting particles which, if charged, interact both electromagnetically and 
weakly, and if neutral, only weakly. By contrast, the quarks (which are the constituents
of hadrons, and thence of nuclei) interact via all three interactions: strong, electromagnetic
and weak. The weak and electromagnetic interactions of both quarks and leptons
are described in a (partially) unified way by the electroweak theory of Glashow, Salam, 
and Weinberg (GSW), which is a generalization of quantum electrodynamics or QED; the 
strong interactions of quarks are described by quantum chromodynamics or QCD, which is 
also analogous to QED. The similarity with QED lies in the fact that all three interactions 
are types of gauge theories, though realized in different ways. In the first volume of this 
book, we will get as far as QED; QCD and the electroweak theory are treated in volume 2. 

The reader will have noticed that the most venerable force of all—gravity—is absent 
from our story. In practical terms this is quite reasonable, since its effect is many orders 
of magnitude smaller than even the weak force, at least until the interparticle separation
reaches distances far smaller than those we shall be discussing. Conceptually also, gravity
still seems to be somewhat distinct from the other forces which, as we have already indicated,
are encouragingly similar. There are no particular fermionic sources carrying ‘gravity
charges’: it seems that all matter gravitates. This of course was a motivation for Einstein’s 
geometrical approach to gravity. Despite the lingering promise of string theory (Green et 
al. 1987, Polchinski 1998, Zwiebach 2004), it is fair to say that the vision of the unification 
of all the forces, which possessed Einstein, is still some way from realization. Gravitational 
interactions are not part of the SM. 

This book is not intended as a completely self-contained textbook on particle physics, 
which would survey the broad range of observed phenomena and outline the main steps by 
which the picture described here has come to be accepted. For this we must refer the reader 
to other sources (e.g. Perkins 2000, Bettini 2008, Thomson 2013). We proceed with a brief 
review of the matter (fermionic) content of the SM. 


1.2 The fermions of the Standard Model 
1.2.1 Leptons 


Forty years after Thomson’s discovery of the electron, the first member of another generation 
of leptons (as it turned out)—the muon—was found independently by Street and Stevenson 
(1937) and by Anderson and Neddermeyer (1937). Following the convention for the electron, 
the $\mu^-$ is the particle and the $\mu^+$ the anti-particle. At first, the muon was identified with
the particle postulated by Yukawa only two years earlier (1935) as the field quantum of the 
‘strong nuclear force field’, the exchange of which between two nucleons would account for 
their interaction (see section 1.3.2). In particular, its mass (105.7 MeV) was nicely within the 
range predicted by Yukawa. However, experiments by Conversi et al. (1947) established that 
the muon could not be Yukawa’s quantum since it did not interact strongly; it was therefore 
a lepton. The $\mu^-$ seemed to behave in exactly the same way as the electron, interacting only
electromagnetically and weakly, with interaction strengths identical to those of an electron. 

In 1975 Perl et al. (1975) discovered yet another ‘replicant’ electron, the $\tau^-$, with a
mass of 1.78 GeV. Once again, the weak and electromagnetic interactions of the $\tau^-$ ($\tau^+$)
appeared to be identical to those of the $e^-$ ($e^+$).

The SM assumes that the three different charged leptons have identical electroweak 
gauge interactions. Measurements of a wide range of decays have shown that the results are 
consistent with this assumption of lepton universality. However, there have been indications 
more recently that this assumption may not be exact. One concerns the decays of b quarks, 
where the LHCb detector at CERN reported a difference, at the level of $3.1\sigma$, between
certain branching ratios to muon pairs and to electron pairs (Aaij et al. 2022). Another
possible universality violation involves the anomalous magnetic moment of the muon, for
which a measurement at FNAL (B. Abi et al. 2021) finds a discrepancy at the level of
$4.2\sigma$ with the prediction of the SM. We shall discuss the second of these (the muon $g-2$
anomaly) further in section 11.7. 

In contrast, the Yukawa interactions between the Higgs boson and the fermions (see 
section 1.4.1) are not governed by gauge symmetry and are proportional to the fermion 
masses, which of course are different (indeed strikingly so). This part of the SM therefore 
violates lepton universality. 

At this stage one might well wonder whether we are faced with a ‘lepton spectroscopy’, 
of which the $e^-$, $\mu^-$ and $\tau^-$ are but the first three states. Yet this seems not to be the
correct interpretation. First, no other such states have (so far) been seen. Second, all these
leptons have the same spin (1/2), which is certainly quite unlike any conventional excitation
spectrum. And third, no $\gamma$-transitions are observed to occur between the states, though this
would normally be expected. For example, the branching fraction for the process 


$$\mu^- \to e^- + \gamma \quad \text{(not observed)} \tag{1.1}$$


is currently quoted as less than $4.2 \times 10^{-13}$ at the 90% confidence level (Workman et al.
2022). Similarly, there are (much less stringent) limits on $\tau^- \to \mu^- + \gamma$ and $\tau^- \to e^- + \gamma$.

If the $e^-$ and $\mu^-$ states in (1.1) were, in fact, the ground and first excited states of some
composite system, the decay process (1.1) would be expected to occur as an electromagnetic 
transition, with a relatively high probability because of the large energy release. Yet the 
experimental upper limit on the rate is very tiny. In the absence of any mechanism to explain 
this, one systematizes the situation, empirically, by postulating the existence of a selection 
rule forbidding the decay (1.1). In taking this step, it is important to realize that ‘absolute 
forbidden-ness’ can never be established experimentally: all that can be done is to place a 
(very small) upper limit on the branching fraction to the ‘forbidden’ channel, as here. The 
possibility will always remain open that future, more sensitive, experiments will reveal that 
some processes, assumed to be forbidden, are in fact simply extremely rare. 

Of course, such a proposed selection rule would have no physical content if it only applied 
to the one process (1.1), but it turns out to be generally true, applying not only to the 
electromagnetic interaction of the charged leptons, but to their weak interactions also. The 
upshot is that we can consistently account for observations (and non-observations) involving 
$e$’s, $\mu$’s and $\tau$’s by assigning to each a new additive quantum number (called ‘lepton flavour’)
which is assumed to be conserved. Thus we have electron flavour $L_e$ such that $L_e(e^-) = 1$
and $L_e(e^+) = -1$; muon flavour $L_\mu$ such that $L_\mu(\mu^-) = 1$ and $L_\mu(\mu^+) = -1$; and tau
flavour $L_\tau$ such that $L_\tau(\tau^-) = 1$ and $L_\tau(\tau^+) = -1$. Each is postulated to be conserved in
all leptonic processes. So (1.1) is then forbidden: the left-hand side has $L_e = 0$ and $L_\mu = 1$,
while the right-hand side has $L_e = 1$ and $L_\mu = 0$.
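
The bookkeeping behind this selection rule can be sketched in a few lines of Python. This is only an illustration: the table `LEPTON_FLAVOUR` and the helper names are hypothetical, the flavour assignments are those just given for the charged leptons, and the photon is taken to carry no lepton flavour.

```python
# Lepton flavour numbers (L_e, L_mu, L_tau) for the charged leptons, as assigned in the text.
# The dictionary and function names are illustrative, not part of any standard library.
LEPTON_FLAVOUR = {
    "e-": (1, 0, 0), "e+": (-1, 0, 0),
    "mu-": (0, 1, 0), "mu+": (0, -1, 0),
    "tau-": (0, 0, 1), "tau+": (0, 0, -1),
    "gamma": (0, 0, 0),  # assumption: the photon carries no lepton flavour
}


def total_flavour(particles):
    """Sum (L_e, L_mu, L_tau) over a list of particle names."""
    return tuple(sum(LEPTON_FLAVOUR[p][i] for p in particles) for i in range(3))


def conserves_lepton_flavour(initial, final):
    """True if each of the three lepton flavours is separately conserved."""
    return total_flavour(initial) == total_flavour(final)


# Process (1.1): mu- -> e- + gamma
print(total_flavour(["mu-"]))                               # (0, 1, 0)
print(total_flavour(["e-", "gamma"]))                       # (1, 0, 0)
print(conserves_lepton_flavour(["mu-"], ["e-", "gamma"]))   # False
```

A `False` result only says that the process is forbidden by the postulated rule; as stressed above, experiment can do no more than bound the rate from above.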

The electromagnetic interactions of the mu and the tau leptons are the same as for the
electron. In weak interactions, each charged lepton ($e$, $\mu$, $\tau$) is accompanied by its ‘own’ neutral
partner, a neutrino. The one emitted with the $e^-$ in $\beta$-decay was originally introduced
by Pauli in 1930, as a ‘desperate remedy’ to save the conservation laws of four-momentum
and angular momentum. In the SM, the three neutrinos are assigned lepton flavour quantum
numbers in such a way as to conserve each lepton flavour separately. Thus we assign
$L_e = -1$, $L_\mu = 0$, and $L_\tau = 0$ to the neutrino emitted in neutron $\beta$-decay


$$n \to p + e^- + \bar{\nu}_e, \tag{1.2}$$


since $L_e = 0$ in the initial state and $L_e(e^-) = +1$; so the neutrino in (1.2) is an anti-neutrino
‘of electron type’ (or ‘of electron flavour’). The physical reality of the anti-neutrinos emitted
in nuclear $\beta$-decay was established by Reines and collaborators in 1956 (Cowan et al. 1956),
by observing that the anti-neutrinos from a nuclear reactor produced positrons via the
inverse $\beta$-process

$$\bar{\nu}_e + p \to n + e^+. \tag{1.3}$$


The neutrino partnering the $\mu^-$ appears in the decay of the $\pi^-$:

$$\pi^- \to \mu^- + \bar{\nu}_\mu, \tag{1.4}$$


where the $\bar{\nu}_\mu$ is an anti-neutrino of muon type ($L_\mu(\bar{\nu}_\mu) = -1$, $L_e(\bar{\nu}_\mu) = 0 = L_\tau(\bar{\nu}_\mu)$). How
do we know that $\bar{\nu}_\mu$ and $\bar{\nu}_e$ are not the same? An important experiment by Danby et al.
(1962) provided evidence that they are not. They found that the neutrinos accompanying
muons from $\pi$-decay always produced muons on interacting with matter, never electrons.
Thus, for example, the lepton flavour conserving reaction 


$$\bar{\nu}_\mu + p \to \mu^+ + n \tag{1.5}$$

was observed, but the lepton flavour violating reaction

$$\bar{\nu}_\mu + p \to e^+ + n \quad \text{(not observed)} \tag{1.6}$$


was not. As with (1.1), ‘non-observation’ of course means, in practice, an upper limit on 
the cross section. Both types of neutrino occur in the $\beta$-decay of the muon itself:


$$\mu^- \to \nu_\mu + e^- + \bar{\nu}_e, \tag{1.7}$$


in which $L_\mu = 1$ is initially carried by the $\mu^-$ and finally by the $\nu_\mu$, and the $L_e$’s of the $e^-$
and $\bar{\nu}_e$ cancel each other out.
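
As a worked check of this bookkeeping for the muon decay (1.7), using only the flavour assignments made above, the two sums can be written out term by term (initial state on the left, the three final-state particles on the right):

```latex
\begin{align*}
L_\mu &: \quad 1 \;=\; \underbrace{1}_{\nu_\mu} + \underbrace{0}_{e^-} + \underbrace{0}_{\bar{\nu}_e}, \\
L_e   &: \quad 0 \;=\; \underbrace{0}_{\nu_\mu} + \underbrace{1}_{e^-} + \underbrace{(-1)}_{\bar{\nu}_e}.
\end{align*}
```

Both lines balance, so the decay is allowed by separate conservation of muon and electron flavour.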

In the same way, the $\nu_\tau$ is associated with the $\tau^-$, and we have arrived at three generations
of charged and neutral lepton doublets:


$$(\nu_e, e^-), \quad (\nu_\mu, \mu^-) \quad \text{and} \quad (\nu_\tau, \tau^-) \tag{1.8}$$


together with their anti-particles. 

We should at this point note that another type of weak interaction is known, in which,
for example, the $\bar{\nu}_\mu$ in (1.5) scatters elastically from the proton, instead of changing into a
$\mu^+$:

$$\bar{\nu}_\mu + p \to \bar{\nu}_\mu + p. \tag{1.9}$$

This is an example of what is called a ‘neutral current’ process, (1.5) being a ‘charged
current’ one. In terms of the Yukawa-like exchange mechanism for particle interactions, to
be described in the next section, (1.5) proceeds via the exchange of charged quanta ($W^\pm$),
while in (1.9) a neutral quantum ($Z^0$) is exchanged.

As well as their flavour, one other property of neutrinos is of great interest, namely their 
mass. As originally postulated by Pauli, the neutrino emitted in $\beta$-decay had to have very
small mass because the maximum energy carried off by the $e^-$ in (1.2) was closely equal
to the difference in rest energies of the neutron and proton. It was subsequently widely 
assumed (perhaps largely for simplicity) that all neutrinos were strictly massless, and it is 
fair to say that the original SM made this assumption. Yet there is, in fact, no convincing 
reason for this (as there is for the masslessness of the photon—see chapter 6), and there 
is now clear evidence that neutrinos do indeed have very small, but non-zero, masses. It 
turns out that the question of neutrino masslessness is directly connected to another one: 
whether neutrino flavour is, in fact, conserved. If neutrinos are massless, as in the original 
SM, neutrinos of different flavour cannot ‘mix’, in the sense of quantum-mechanical states; 
but mixing can occur if neutrinos have mass. The phenomenon of neutrino flavour mixing 
(or “neutrino oscillations”) is now well established, and will be discussed in chapter 21 
in volume 2. In this book we shall simply regard non-zero neutrino masses as part of the 
(updated) Standard Model. The SM leptons are listed in Table 1.1, along with some relevant 
properties. The mass values are taken from Workman et al. (2022).


TABLE 1.1
Properties of SM leptons.

Generation   Particle      Mass (MeV)                $Q/e$   $L_e$   $L_\mu$   $L_\tau$
1            $\nu_e$       $< 1.1 \times 10^{-6}$    0       1       0         0
             $e^-$         0.511                     -1      1       0         0
2            $\nu_\mu$     $< 0.19$                  0       0       1         0
             $\mu^-$       105.658                   -1      0       1         0
3            $\nu_\tau$    $< 18.2$                  0       0       0         1
             $\tau^-$      1777                      -1      0       0         1


We now turn to the other fermions in the SM.


1.2.2 Quarks


Quarks are the constituents of hadrons, in which they are bound by the strong QCD forces.
Hadrons with spins 1/2, 3/2, 5/2, ... (i.e. fermions) are baryons, those with spins 0, 1, 2, ... (i.e.
bosons) are mesons. Examples of baryons are nucleons (the neutron n and the proton p), and
hyperons such as $\Lambda^0$ and the $\Sigma$ and $\Xi$ states. Evidence for the composite nature of hadrons
accumulated during the 1960s and 1970s. Elastic scattering of electrons from protons by 
Hofstadter and co-workers (Hofstadter 1963) showed that the proton was not pointlike, but 
had an approximately exponential distribution of charge with a root mean square radius of 
about 0.8 fm. Much careful experimentation in the field of baryon and meson spectroscopy 
revealed sequences of excited states, strongly reminiscent of those well-known in atomic and 
nuclear physics. 

The conclusion would now seem irresistible that such spectra should be interpreted as 
the energy levels of systems of bound constituents. A specific proposal along these lines was 
made in 1964 by Gell-Mann (1964) and Zweig (1964). Though based on somewhat different 
(and much more fragmentary) evidence, their suggestion has turned out to be essentially 
correct. They proposed that baryons contain three spin-1/2 constituents called quarks (by
Gell-Mann), while mesons are quark-antiquark systems. One immediate consequence is that
quarks have fractional electromagnetic charge. For example, the proton has two quarks of
charge $+2/3$, called ‘up’ (u) quarks, and one quark of charge $-1/3$, the ‘down’ (d) quark. The
neutron has the combination ddu, while the $\pi^+$ has one u and one anti-d ($\bar{d}$) and so on.
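
The charge assignments can be verified with exact fractions. The short Python sketch below is illustrative only: the function name and the trailing `~` used to mark an antiquark are assumptions made here, while the quark charges (in units of e) are those quoted in the text.

```python
from fractions import Fraction

# Quark charges in units of e, as quoted in the text for the u and d quarks.
QUARK_CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}


def hadron_charge(quarks):
    """Sum the constituent charges; a trailing '~' marks an antiquark (notation assumed here)."""
    total = Fraction(0)
    for q in quarks:
        if q.endswith("~"):
            total -= QUARK_CHARGE[q[:-1]]  # an antiquark carries the opposite charge
        else:
            total += QUARK_CHARGE[q]
    return total


print(hadron_charge(["u", "u", "d"]))  # proton: 1
print(hadron_charge(["d", "d", "u"]))  # neutron: 0
print(hadron_charge(["u", "d~"]))      # pi+: 1
```

Using `Fraction` rather than floating-point numbers keeps the thirds exact, so the integer hadron charges come out without rounding.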

Quite simple quantum-mechanical bound state quark models, based on these ideas, were 
remarkably successful in accounting for the observed hadronic spectra. Nevertheless, many 
physicists, in the 1960s and early 1970s, continued to regard quarks more as useful devices 
for systematizing a mass of complicated data than as genuine items of physical reality. One 
reason for this scepticism must now be confronted, for it constitutes a major new twist in 
the story of the structure of matter. 

Gell-Mann ended his 1964 paper with the remark: ‘A search for stable quarks of
charge $-1/3$ or $+2/3$ and/or stable di-quarks of charge $-2/3$ or $+1/3$ or $+4/3$ at the highest energy
accelerators would help to reassure us of the non-existence of real quarks’. Indeed, with one
possible exception (La Rue et al. 1977, 1981), this ‘reassurance’ has been handsomely provided!
Unlike the constituents of atoms and nuclei, quarks have not been observed as stable
isolated particles. When hadrons of the highest energies currently available are smashed 
into each other, what is observed downstream is only lots more hadrons, not fractionally 
charged quarks. The explanation for this novel behaviour of quarks is now believed to lie in 
the nature of the interquark force (QCD). We shall briefly discuss this force in section 1.3.6, 
and treat it in detail in volume 2. The consensus at present is that QCD does imply the
confinement of quarks: that is, they do not exist as isolated single particles¹, only as groups
confined to hadronic volumes.

When Gell-Mann and Zweig made their proposal, three types of quark were enough
to account for the observed hadrons: in addition to the u and d quarks, the ‘strange’
quark s was needed to describe the known strange particles such as the hyperon $\Lambda^0$ (uds),
and the strange mesons like $K^0$ ($d\bar{s}$). In 1964, Bjorken and Glashow (1964) discussed the
possible existence of a fourth quark on the basis of quark–lepton symmetry, but a strong
theoretical argument for the existence of the c-quark, within the framework of gauge theories
of electroweak interactions, was given by Glashow, Iliopoulos and Maiani (1970), as we shall
discuss in volume 2. They estimated that the c-quark mass should lie in the range 3–4 GeV.
Subsequently, Gaillard and Lee (1974) performed a full (one-loop) calculation in the then
newly-developed renormalizable electroweak theory, and predicted $m_c \approx 1.5$ GeV. The
prediction was spectacularly confirmed in November of the same year with the discovery
(Aubert et al. 1974, Augustin et al. 1974) of the $J/\psi$ system, which was soon identified as a $c\bar{c}$
composite (and dubbed “charmonium”), with a mass in the vicinity of 3 GeV. Subsequently,
mesons such as $D^0$($c\bar{u}$) and $D^+$($c\bar{d}$) carrying the c quark were identified (Goldhaber et al.
1976, Peruzzi et al. 1976), consolidating this identification.

¹ With the (fleeting) exception of the t quark, as we shall see in a moment.

The second generation of quarks was completed in 1974, with the two quark doublets 
(u, d) and (c, s) in parallel with the lepton doublets $(\nu_e, e^-)$ and $(\nu_\mu, \mu^-)$. But even before
the discovery of the c quark, the possibility that a completely new third-generation quark 
doublet might exist was raised in a remarkable paper by Kobayashi and Maskawa (1973). 
Their analysis focused on the problem of incorporating the known violation of CP symmetry 
(the product² of particle-antiparticle conjugation C and parity P) into the quark sector of
the renormalizable electroweak theory. CP-violation in the decays of neutral K-mesons had 
been discovered by Christenson et al. (1964), and Kobayashi and Maskawa pointed out 
that it was very difficult to construct a plausible model of CP-violation in weak transitions 
of quarks with only two generations. They suggested, however, that CP-violation could 
be naturally accommodated by extending the theory to three generations of quarks. Their 
description of CP-violation thus entailed the very bold prediction of two entirely new and 
undiscovered quarks, the (t, b) doublet, where t has charge $2/3$ and b has charge $-1/3$.

In 1975, with the discovery of the $\tau^-$ mentioned earlier, there was already evidence
for a third generation of leptons. The discovery of the b quark in 1977 resulted from the
observation of massive mesonic states generally known as $\Upsilon$ (‘upsilon’) (Herb et al. 1977,
Innes et al. 1977), which were identified as $b\bar{b}$ composites. Subsequently, b-carrying mesons
were found. Finally, firm evidence for the expected t quark was obtained by the CDF and D0
collaborations at Fermilab in 1995 (Abe et al. 1995, Abachi et al. 1995). The full complement
of three generations of quark doublets is then


$$(u, d), \quad (c, s) \quad \text{and} \quad (t, b) \tag{1.10}$$


together with their antiparticles, in parallel with the three generations of lepton doublets 
(1.8). 

One particular feature of the t quark requires comment. Its mass is so large that, although 
it decays weakly, the energy release is so great that its lifetime ($\tau \sim 4 \times 10^{-25}$ s) is some two
orders of magnitude shorter than typical strong interaction timescales; this means that it
decays before any t-carrying hadrons can be formed. Another way of seeing this is to consider
the distance a t quark can travel when produced, which will be of order $c\tau \sim 10^{-16}$ m, at
which distances the QCD interactions will be relatively weak (see section 1.3.6). So when a 
t quark is produced (in a p-p collision, for example), it decays as a free (unbound) particle. 
Its mass can be determined from a kinematic analysis of the decay products, but there are 
subtle issues relating to the definition of the top quark mass when interpreting the results 
of high precision data, as we shall discuss in section 22.7.3. 
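
The distance estimate in this paragraph is a one-line calculation, sketched below in Python. The numerical inputs are assumptions at the order-of-magnitude level: a top-quark lifetime of about 4 × 10⁻²⁵ s, and, for comparison, the proton radius of about 0.8 fm mentioned in connection with Hofstadter's experiments. Relativistic time dilation is ignored.

```python
# Order-of-magnitude estimate of the distance a top quark travels before decaying.
C_LIGHT = 2.998e8         # speed of light in m/s
TAU_TOP = 4e-25           # assumed top-quark lifetime in s (order of magnitude only)
PROTON_RADIUS = 0.8e-15   # proton rms charge radius in m (about 0.8 fm)


def decay_length(lifetime_s):
    """Distance c * tau, ignoring relativistic time dilation."""
    return C_LIGHT * lifetime_s


length = decay_length(TAU_TOP)
print(f"c*tau = {length:.1e} m")                                     # c*tau = 1.2e-16 m
print(f"fraction of proton radius = {length / PROTON_RADIUS:.2f}")   # 0.15
```

On these assumptions the t quark covers only a small fraction of a typical hadronic size before it decays, which is the sense in which it decays as a free particle.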

We must now discuss the quantum numbers carried by quarks. First of all, each quark
listed in (1.10) comes in three varieties, distinguished by a quantum number called ‘colour’.
It is precisely this quantum number that underlies the dynamics of QCD (see section 1.3.6).
Colour, in fact, is a kind of generalized charge, for the strong QCD interactions. We shall
denote the three colours of a quark by ‘red’, ‘blue’ and ‘green’. Thus we have the triplet
$(u_r, u_b, u_g)$, and similarly for all the other quarks.

² We shall discuss these symmetries in chapter 4.

Secondly, quarks carry flavour quantum numbers, like the leptons. In the quark case, 
they are as follows. The two quarks which are familiar in ordinary matter, ‘u’ and ‘d’, are
an isospin doublet (see chapter 12 in volume 2) with $T_3 = +1/2$ for ‘u’ and $T_3 = -1/2$ for
‘d’. The flavour of ‘s’ is strangeness, with the value $S = -1$. The flavour of ‘c’ is charm,
with value $C = +1$, that of ‘b’ is beauty with value $\tilde{B} = -1$ (we use $\tilde{B}$ to distinguish it
from baryon number $B$), and the flavour of ‘t’ is $T = +1$. The convention is that the sign
of the flavour number is the same as that of the charge. 

The strong and electromagnetic interactions of quarks are independent of quark flavour, 
and depend only on the electromagnetic charge and the strong charge, respectively. This 
means, in particular, that flavour cannot change in a strong interaction among hadrons— 
that is, flavour is conserved in such interactions. For example, from a zero strangeness initial 
state, the strong interaction can only produce pairs of strange particles, with cancelling 
strangeness. This is the phenomenon of ‘associated production’, known since the early days 
of strange particle physics in the 1950s. Similar rules hold for the other flavours: for example, 
the t quark, once produced, cannot decay to a lighter quark via a strong interaction, since 
this would violate T-conservation. 
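
Associated production can be illustrated by counting strangeness on each side of a reaction. The Python sketch below is an assumption-laden illustration: the two reactions, the quark-content table and the `~` notation for antiquarks are chosen here for the example (π⁻ p → K⁰ Λ⁰ being the textbook case of associated production), with S = −1 for each s quark as stated above.

```python
# Quark content of a few hadrons; a trailing "~" marks an antiquark.
# The notation and helper names are assumptions made for this sketch.
QUARK_CONTENT = {
    "pi-": ["d", "u~"],
    "p": ["u", "u", "d"],
    "n": ["d", "d", "u"],
    "K0": ["d", "s~"],
    "Lambda0": ["u", "d", "s"],
}


def strangeness(hadron):
    """S = -1 for each s quark and +1 for each anti-s quark."""
    quarks = QUARK_CONTENT[hadron]
    return quarks.count("s~") - quarks.count("s")


def conserves_strangeness(initial, final):
    """True if total strangeness is the same before and after."""
    return sum(map(strangeness, initial)) == sum(map(strangeness, final))


# Zero-strangeness initial state producing a pair with cancelling strangeness:
print(conserves_strangeness(["pi-", "p"], ["K0", "Lambda0"]))  # True
# A single strange particle in the final state is not allowed strongly:
print(conserves_strangeness(["pi-", "p"], ["K0", "n"]))        # False
```

The same counting applies to the other flavours, with the sign conventions listed earlier in this section.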

In weak interactions, by contrast, quark flavour is generally not conserved. For example, 
in the semi-leptonic decay 

$$\Lambda^0(uds) \to p(uud) + e^- + \bar{\nu}_e, \tag{1.11}$$


an s quark changes into a u quark. The rather complicated flavour structure of weak interactions,
which remains an active field of study, will be reviewed when we come to the
GSW theory in volume 2. However, one very important, though technical, point must be 
made about the weak interactions of quarks and leptons. It is natural to wonder whether a 
new generation of quarks might appear, unaccompanied by the corresponding leptons—or 
vice versa. Within the framework of the SM interactions, the answer is no. It turns out 
that subtle quantum field theory effects called ‘anomalies’, to be discussed in chapter 18 
of volume 2, would spoil the renormalizability of the weak interactions (see section 1.4.1), 
unless there are equal numbers of quark and lepton generations.

We end this section with some comments about the quark masses; the values listed
in Table 1.2 are those given in Workman et al. (2022).

TABLE 1.2
Properties of SM quarks.

Generation   Particle            Mass                            $Q/e$   $S$   $C$   $\tilde{B}$   $T$
1            $u_r\ u_b\ u_g$     $2.16^{+0.49}_{-0.26}$ MeV      2/3     0     0     0             0
             $d_r\ d_b\ d_g$     $4.67^{+0.48}_{-0.17}$ MeV      -1/3    0     0     0             0
2            $c_r\ c_b\ c_g$     $1.27 \pm 0.02$ GeV             2/3     0     1     0             0
             $s_r\ s_b\ s_g$     $93.4^{+8.6}_{-3.4}$ MeV        -1/3    -1    0     0             0
3            $t_r\ t_b\ t_g$     $172.69 \pm 0.30$ GeV           2/3     0     0     0             1
             $b_r\ b_b\ b_g$     $4.18^{+0.03}_{-0.02}$ GeV      -1/3    0     0     -1            0

As we have already noted, the
t quark is the only one whose mass can be directly measured, because it decays as an
essentially free particle deep inside the hadronic volume. All the others are (it would appear) 
permanently confined inside hadrons. It is therefore not immediately obvious how to define— 
and measure—their masses. In a more familiar bound state problem, such as a nucleus, 
the masses of the constituents are those we measure when they are free of the nuclear 
binding forces—i.e. when they are far apart. For the QCD force, the situation is very 
different. There it turns out that the force is very weak at short distances, a property called 
asymptotic freedom—see section 1.3.6; this important property will be treated in section 
15.3 of volume 2. We may think of the force as very roughly analogous to that of a spring 
joining two constituents. To separate them, energy must be supplied to the system. So when 
the constituents are no longer close, the energy of the system is greater than the sum of 
the short distance (free) quark masses. In potential models (see section 1.3.6), the effect 
is least pronounced for the “heavy” quarks ($m_q$ greater than about 1 GeV). For example,
the ground state of the $\Upsilon(b\bar{b})$ lies at about 9.46 GeV, which is about 1.1 GeV above the
value of $2m_b$ as given in Table 1.2. For $\psi(c\bar{c})$ the ground state is at about 3 GeV, somewhat
greater than $2m_c$. For the three lightest quarks, and especially for the u and d quarks, the
position is quite different: for example, the proton (uud) with a mass of 938 MeV is far more
massive than $2m_u + m_d$. Here the “spring” is responsible for about 300 MeV per quark.
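
The size of the “spring” contribution can be estimated by subtracting the constituent quark masses from the hadron mass. The Python sketch below does this for the proton and for the charmonium ground state. It is a rough illustration within the potential-model picture only: the central quark masses are taken to be the 2022 Particle Data Group values (2.16, 4.67 and 1270 MeV for u, d and c), the charmonium mass is rounded to 3 GeV, and the function name is hypothetical.

```python
# Central values in MeV. Quark masses are assumed to be the 2022 Particle Data Group
# central values; hadron masses are the rounded figures quoted in the text.
M_U, M_D, M_C = 2.16, 4.67, 1270.0
M_PROTON = 938.0
M_PSI = 3000.0  # "about 3 GeV"


def excess_mass(hadron_mass, quark_masses):
    """Hadron mass minus the sum of its constituent (Lagrangian) quark masses."""
    return hadron_mass - sum(quark_masses)


proton_excess = excess_mass(M_PROTON, [M_U, M_U, M_D])
print(f"proton: {proton_excess:.0f} MeV in total, {proton_excess / 3:.0f} MeV per quark")
# proton: 929 MeV in total, 310 MeV per quark
print(f"psi: {excess_mass(M_PSI, [M_C, M_C]):.0f} MeV in total")
# psi: 460 MeV in total
```

For the proton almost the entire mass is excess, whereas for charmonium the excess is a modest fraction of the total, in line with the contrast drawn above between light and heavy quarks.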

While this picture is qualitatively useful, it is clearly model-dependent, as would be 
even a more sophisticated quark model. To do the job properly, we have to go to the 
actual QCD Lagrangian, and use it to calculate the hadron masses with the Lagrangian 
masses as input. This can be done through a lattice simulation of the field theory, as will 
be described in chapter 16 of volume 2. Independently, another handle on the Lagrangian 
masses is provided by the fact that the QCD Lagrangian has an extra symmetry (“chiral 
symmetry”) which is exact when the quark masses are zero. This is, in fact, an excellent 
approximation for the u and d quarks, and a fair one for the s quark. The symmetry is, 
however, dynamically (“spontaneously”) broken by QCD, in such a way as to generate 
(in the case $m_u = m_d = 0$) the nucleon mass entirely dynamically, along with a massless
pion. The small Lagrangian masses can then be treated perturbatively in a procedure called 
“chiral perturbation theory”. These essential features of QCD will be treated in chapter 18 
of volume 2. For the moment, we accept the values in Table 1.2, given in Workman et al. 
(2022), which contains a review of quark masses. 


1.3 Particle Interactions in the Standard Model 
1.3.1 Classical and quantum fields 


In the world of the classical physicist, matter and force were clearly separated. The nature of 
matter was intuitive, based on everyday macroscopic experience; force, however, was more 
problematical. Contact forces between bodies were easy to understand, but forces which 
seemed capable of acting at a distance caused difficulties. 


That gravity should be innate, inherent and essential to matter, so that one body 
can act upon another at a distance, through a vacuum, without the mediation of 
anything else, by and through which action and force may be conveyed from one 
to the other, is to me so great an absurdity, that I believe no man who has in 
philosophical matters a competent faculty of thinking can ever fall into it. (Letter
from Newton to Bentley)

Newton could find no satisfactory mechanism, or physical model, for the transmission of
the gravitational force between two distant bodies; but his dynamical equations provided 
a powerful predictive framework, given the (unexplained) gravitational force law; and this 
eventually satisfied most people. 

The nineteenth century saw the precise formulation of the more intricate force laws of
electromagnetism. Here too the distaste for action-at-a-distance theories led to numerous 
mechanical or fluid mechanical models of the way electromagnetic forces (and light) are 
transmitted. Maxwell made brilliant use of such models as he struggled to give physical 
and mathematical substance to Faraday’s empirical ideas about lines of force. Maxwell’s 
equations were indeed widely regarded as describing the mechanical motion of the ether— 
an amazing medium, composed of vortices, gear wheels, idler wheels and so on. But in his 
1864 paper, the third and final one of the series on lines of force and the electromagnetic 
field, Maxwell himself appeared ready to throw away the mechanical scaffolding and let the 
finished structure of the field equations stand on its own. Later these field equations were 
derived from a Lagrangian (see chapter 7), and many physicists came to agree with Poincaré 
that this ‘generalized mechanics’ was more satisfactory than a multitude of different ether 
models; after all, the same mathematical equations can describe, when suitably interpreted, 
systems of masses, springs and dampers, or of inductors, capacitors and resistors. With this 
step, the concepts of mechanics were enlarged to include a new fundamental entity, the 
electromagnetic field. 

The action-at-a-distance dilemma was solved, since the electromagnetic field permeates 
all of space surrounding charged or magnetic bodies, responds locally to them, and itself 
acts on other distant bodies, propagating the action to them at the speed of light: for 
Maxwell’s theory, besides unifying electricity and magnetism, also predicted the existence 
of electromagnetic waves which should travel with the speed of light, as was confirmed by 
Hertz in 1888. Indeed, light was a form of electromagnetic wave. 

Maxwell published his equations for the dynamics of the electromagnetic field (Maxwell 
1864) some forty years before Einstein’s 1905 paper introducing special relativity. But 
Maxwell’s equations are fully consistent with relativity as they stand (see chapter 2), and 
thus constitute the first relativistic (classical) field theory. The Maxwell Lagrangian lives 
on, as part of QED. 

It seems almost to be implied by the local field concept, and the desire to avoid action 
at a distance, that the fundamental carriers of electricity should themselves be point-like, 
so that the field does not, for example, have to interact with different parts of an electron 
simultaneously. Thus the point-like nature of elementary matter units seems intuitively to 
be tied to the local nature of the force field via which they interact. 

Very soon after the successes of classical field physics, however, another world began 
to make its appearance—the quantum one. First the photoelectric effect and then—much 
later—the Compton effect showed unmistakeably that electromagnetic waves somehow also 
had a particle-like aspect, the photon. At about the same time, the intuitive understanding 
of the nature of matter began to fail as well: supposedly particle-like things, like electrons, 
displayed wave-like properties (interference and diffraction). Thus the conceptual distinction 
between matter and forces, or between particle and field, was no longer so clear. On the one 
hand, electromagnetic forces, treated in terms of fields, now had a particle aspect; and on 
the other hand, particles now had a wave-like or field aspect. ‘Electrons’, writes Feynman 
(1965a) at the beginning of volume 3 of his Lectures on Physics, ‘behave just like light’. 

How can we build a theory of electrons and photons which does justice to all the ‘point-like’,
‘local’, ‘wave/particle’ ideas just discussed? Consider the apparently quite simple process
of spontaneous decay of an excited atomic state in which a photon is emitted:

$$A^* \to A + \gamma. \tag{1.12}$$

Ordinary non-relativistic quantum mechanics cannot provide a first-principles account of
this process, because the degrees of freedom it normally discusses are those of the matter 
units alone—that is, in this example, the electronic degrees of freedom. However, it is clear 
that something has changed radically in the field degrees of freedom. On the left-hand side, 
the matter is in an excited state and the electromagnetic field is somehow not manifest; on 
the right, the matter has made a transition to a lower-energy state and the energy difference 
has gone into creating a quantum of electromagnetic radiation. What is needed here is a 
quantum theory of the electromagnetic field — a quantum field theory. 

Quantum field theory — or qft for short — is the fundamental formal and conceptual 
framework of the SM. An important purpose of this book is to make this core twentieth 
century formalism more generally accessible. In chapter 5 we give a step-by-step introduc- 
tion to qft. We shall see that a free classical field — which has infinitely many degrees of 
freedom — can be thought of as mathematically analogous to a vibrating solid (which 
has merely a very large number). The way this works mathematically is that the Fourier 
components of the field act like independent harmonic oscillators, just like the vibrational 
‘normal modes’ of the solid. When quantum mechanics is applied to this system, the energy 
eigenvalues of each oscillator are quantized in the familiar way, as $(n_r + 1/2)\hbar\omega_r$ for each
oscillator of frequency $\omega_r$: we say that such states contain ‘$n_r$ quanta of frequency $\omega_r$’. The
state of the entire field is characterized by how many quanta of each frequency are present. 
These ‘excitation quanta’ are the particle aspect of the field. In the ground state there are 
no excitations present — no field quanta — and so that is the vacuum state of the field. 

In the case of the electromagnetic field, these quanta are of course photons (for the 
solid, they are phonons). In the process (1.12), the electromagnetic field was originally in 
its ground (no photon) state, and was raised finally to an excited state by the transfer of 
energy from the electronic degrees of freedom. The final excited field state is defined by the 
presence of one quantum (photon) of the appropriate energy. 

We obviously cannot stop here (‘Electrons behave just like light’). All the particles of 
the SM must be described as excitation quanta of the corresponding quantum fields. But of 
course Feynman was somewhat overstating the case. The quanta of the electromagnetic field 
are bosons, and there is no limit on the number of them that can occupy a single quantum 
state. By contrast, the quanta of the electron field, for example, must be fermions, obeying 
the exclusion principle. In chapter 7, we shall see what modifications to the quantization 
procedure this requires. We must also introduce interactions between the excitation quanta, 
or equivalently between the quantum fields. This we do in chapter 6 for bosonic fields, and 
in chapter 7 for the Dirac and Maxwell fields, thereby arriving at QED, our first quantum
gauge field theory of the SM. 

One reason the Lagrangian formulation of classical field (or particle) physics is so powerful
is that symmetries can be efficiently incorporated, and their connection with conservation
laws easily exhibited. The same is even more true in qft. For example, only in qft can the 
symmetry corresponding to electric charge conservation be simply understood. Indeed, all 
the quantum gauge field theories of the SM are deeply related to symmetries, as will become 
clear in the subsequent development. 

In some cases, however, the symmetry—though manifest in the Lagrangian—is not vis- 
ible in the usual empirical ways (conservation laws, particle multiplets and so on). Instead, 
it is ‘spontaneously (or dynamically) broken’. This phenomenon plays a crucial role in both 
QCD and the GSW theory. An aid to understanding it physically is provided by the analogy 
between the vacuum state of an interacting qft and the ground state of an interacting quan- 
tum many-body system—an insight due to Nambu (1960). We give an extended discussion 
of spontaneously broken symmetry in Part 7 of volume 2. We shall see how the neutral 
bosonic (Bogoliubov) superfluid and the charged fermionic (BCS) superconductor offer
instructive working models of dynamical symmetry breaking, relevant to chiral symmetry
breaking in QCD, and to the generation of gauge boson masses in the GSW theory. 

The road ahead is a long one, and we begin our journey at a more descriptive and pictorial 
level, making essential use of Yukawa’s remarkable insight into the quantum nature of force. 
In due course, in chapter 6, we shall begin to see how qft supplies the precise mathematical 
formulae associated with such pictures. 


1.3.2 The Yukawa theory of force as virtual quantum exchange 


Yukawa’s revolutionary paper (Yukawa 1935) proposed a theory of the strong interaction 
between a proton and a neutron, and also considered its possible extension to neutron 
$\beta$-decay. He built his theory by analogy with electromagnetism, postulating a new field of
force with an associated new field quantum, analogous to the photon. In doing so, he showed 
with particular clarity how, in quantum field theory, particles interact by exchanging virtual 
quanta, which mediate the force. 

Before proceeding, we should emphasize that we are not presenting Yukawa’s ideas 
as a viable candidate theory of strong and weak interactions. Crucially, Yukawa assumed 
that the nucleons and his quantum (later identified with the pion) were point-like, but in 
fact both nucleons and pions are quark composites with spatial extension. The true ‘strong’ 
interaction relates to the quarks, as we shall see in section 1.3.6. There are also other details 
of his theory which were (we now know) mistaken, as we shall discuss. Yet his approach 
was profound, and—as happens often in physics—even though the initial application was 
ultimately superseded, the ideas have broad and lasting validity. 

Yukawa began by considering what kind of static potential might describe the n-p 
interaction. It was known that this interaction decreased rapidly for interparticle separation
$r > 2$ fm. Hence, the potential could not be of coulombic type $\propto 1/r$. Instead, Yukawa
postulated an n-p potential energy of the form


$$U(r) = \frac{-g_N^2}{4\pi}\,\frac{e^{-r/a}}{r} \tag{1.13}$$

where ‘$g_N$’ is a constant analogous to the electric charge $e$, $r = |\mathbf{r}|$ and ‘$a$’ is a range
parameter ($\sim 2$ fm). This static potential satisfies the equation


$$\left(\nabla^2 - \frac{1}{a^2}\right) U(\mathbf{r}) = g_N^2\,\delta(\mathbf{r}) \tag{1.14}$$


(see appendix G) showing that it may be interpreted as the mutual potential energy of one
point-like test nucleon of ‘strong charge’ $g_N$ due to the presence of another point-like nucleon
of equal charge $g_N$ at the origin, a distance $r$ away. Equation (1.14) should be thought of
as a finite range analogue of Poisson’s equation in electrostatics (equation (G.3))

$$\nabla^2 V(\mathbf{r}) = -\rho(\mathbf{r})/\epsilon_0 \tag{1.15}$$


the delta function in (1.14) (see appendix E) expressing the fact that the ‘strong charge 
density’ acting as the source of the field is all concentrated into a single point, at the origin. 
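As an illustrative check (a minimal sketch, not part of the original derivation), the static potential (1.13) can be tested numerically against (1.14) away from the origin, where the delta function vanishes and the equation reduces to $\nabla^2 U = U/a^2$. The helper names, the finite-difference step and the choice $g_N^2 = 1$, $a = 2$ fm are assumptions made only for this example.

```python
import math

def yukawa_potential(r, g2=1.0, a=2.0):
    """Static Yukawa potential (1.13): U(r) = -(g2 / 4 pi) * exp(-r/a) / r."""
    return -(g2 / (4.0 * math.pi)) * math.exp(-r / a) / r

def radial_laplacian(f, r, h=1e-3):
    """Laplacian of a spherically symmetric function, f'' + (2/r) f',
    estimated with central finite differences of step h."""
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / h**2
    first = (f(r + h) - f(r - h)) / (2.0 * h)
    return second + 2.0 * first / r

def relative_residual(r, a=2.0):
    """(laplacian(U) - U/a^2) divided by |U/a^2|; expected to be ~0 for r != 0."""
    u = lambda x: yukawa_potential(x, a=a)
    return (radial_laplacian(u, r) - u(r) / a**2) / abs(u(r) / a**2)

# Sample a few separations (in fm) away from the origin.
for r in (1.0, 2.0, 5.0):
    print(f"r = {r:4.1f} fm   relative residual = {relative_residual(r):.2e}")
```

The residuals are limited only by the finite-difference step, which is consistent with (1.13) solving (1.14) everywhere except at the point source itself.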

Yukawa now sought to generalize (1.14) to the non-static case, so as to obtain a field
equation for $U(\mathbf{r},t)$. For $r \neq 0$, he proposed the free-space equation (we shall keep factors
of $c$ and $\hbar$ explicit for the moment)

$$\left(\nabla^2 - \frac{\partial^2}{c^2\,\partial t^2} - \frac{1}{a^2}\right) U(\mathbf{r},t) = 0 \tag{1.16}$$

which is certainly relativistically invariant (see appendix D). Thus far, $U$ is still a classical
field. Now Yukawa took the decisive step of treating $U$ quantum mechanically, by looking
for a (de Broglie-type) propagating wave solution of (1.16), namely

$$U \propto \exp(i\mathbf{p}\cdot\mathbf{r}/\hbar - iEt/\hbar). \tag{1.17}$$


Inserting (1.17) into (1.16) one finds

$$\frac{E^2}{c^2\hbar^2} = \frac{\mathbf{p}^2}{\hbar^2} + \frac{1}{a^2} \tag{1.18}$$

or, taking the positive square root,

$$E = \left[c^2\mathbf{p}^2 + \frac{c^2\hbar^2}{a^2}\right]^{1/2}.$$

Comparing this with the standard $E$-$\mathbf{p}$ relation for a massive particle in special relativity
(appendix D), the fundamental conclusion is reached that the quantum of the finite-range
force field $U$ has a mass $m_U$ given by

$$m_U^2 c^4 = \frac{c^2\hbar^2}{a^2} \quad \text{or} \quad m_U = \frac{\hbar}{ac}. \tag{1.19}$$

This means that the range parameter in (1.13) is related to the mass of the quantum $m_U$
by

$$a = \frac{\hbar}{m_U c}. \tag{1.20}$$

Inserting $a \approx 2$ fm gives $m_U \approx 100$ MeV, Yukawa’s famous prediction for the mass of the
nuclear force quantum.
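The arithmetic behind this estimate can be sketched in a few lines. The conversion constant $\hbar c \approx 197.327$ MeV fm and the charged-pion mass of about 139.6 MeV are standard values that are not quoted in the text above, and the function names are purely illustrative.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (rounded standard value, assumed here)

def yukawa_mass_mev(a_fm):
    """Rest energy m_U c^2 in MeV for a range parameter a in fm, from (1.19)."""
    return HBAR_C_MEV_FM / a_fm

def yukawa_range_fm(mass_mev):
    """Range parameter a in fm for a quantum of rest energy mass_mev, from (1.20)."""
    return HBAR_C_MEV_FM / mass_mev

# Yukawa's input: a range of about 2 fm.
print(f"a = 2 fm          ->  m_U c^2 ~ {yukawa_mass_mev(2.0):.1f} MeV")
# Reverse direction, using the measured charged-pion mass (assumed value).
print(f"m = 139.6 MeV     ->  a ~ {yukawa_range_fm(139.6):.2f} fm")
```

Running the relation in reverse with the pion mass gives a range somewhat below 2 fm, which is as close as an order-of-magnitude argument of this kind can be expected to come.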

Next, Yukawa envisaged that the U-quantum would be emitted in the transition $n \to p$,
via a process analogous to (1.12):

$$n \to p + U^- \tag{1.21}$$

where charge conservation determines the $U^-$ charge. Yet there is an obvious difference
between (1.21) and (1.12): (1.21) violates energy conservation since $m_n < m_p + m_U$ if
$m_U \approx 100$ MeV, so it cannot occur as a real emission process. However, Yukawa noted that
if (1.21) were combined with the inverse process

$$p + U^- \to n \tag{1.22}$$

then an n-p interaction could take place by the mechanism shown in figure 1.1(a); namely,
by the emission and subsequent absorption (that is, by the exchange) of a $U^-$ quantum.
He also included the corresponding $U^+$ exchange, where $U^+$ is the anti-particle of the $U^-$,
as shown in figure 1.1(b).

An energy-violating transition such as (1.21) is known as a ‘virtual’ transition in
quantum mechanics. Such transitions are routinely present in quantum-mechanical time-dependent
perturbation theory and can be understood in terms of an ‘energy-time uncertainty
relation’

$$\Delta E\,\Delta t \geq \hbar/2. \tag{1.23}$$

The relation (1.23) may be interpreted as follows (we abridge the careful discussion in
section 44 of Landau and Lifshitz (1977)). Imagine an ‘energy-measuring device’ set up to
measure the energy of a quantum system. To do this, the device must interact with the
quantum system for a certain length of time $\Delta t$. If the energy of a sequence of identically-prepared
quantum systems is measured, only in the limit $\Delta t \to \infty$ will the same energy
be obtained each time. For finite $\Delta t$, the measured energies will necessarily fluctuate by an
amount $\Delta E$ as given by (1.23); in particular, the shorter the time over which the energy
measurement takes place, the larger the fluctuations in the measured energy.

FIGURE 1.1
Yukawa’s single-U exchange mechanism for the n-p interaction. (a) $U^-$ exchange. (b) $U^+$
exchange.

Wick (1938) applied (1.23) to Yukawa’s theory, and thereby shed new light on the
relation (1.20). Suppose a device is set up capable of checking to see whether energy is, in
fact, conserved while the $U^\pm$ crosses over in figure 1.1. The crossing time $t$ must be at least
$r/c$, where $r$ is the distance apart of the nucleons. However, the device must be capable of
operating on a time scale smaller than $t$ (otherwise it will not be in a position to detect
the $U^\pm$), but it need not be very much less than this. Thus the energy uncertainty in the
reading by the device will be$^3$

$$\Delta E \sim \frac{\hbar c}{r}. \tag{1.24}$$

As $r$ decreases, the uncertainty $\Delta E$ in the measured energy increases. If we require
$\Delta E = m_U c^2$, then

$$r \sim \frac{\hbar}{m_U c} \tag{1.25}$$

just as in (1.20). The ‘$r$’ in (1.25) is the extent of the separation allowed between the n and
the p, such that (in the time available) the $U^\pm$ can ‘borrow’ the necessary energy to come
into existence and cross from one to the other. In this sense, $r$ is the effective range of the
associated force as in (1.20).

Despite the similarity to virtual intermediate states in ordinary quantum mechanics, the
Yukawa-Wick process is nevertheless truly revolutionary because it postulated an energy
fluctuation $\Delta E$ great enough to create an as yet unseen new particle, a new state of matter.

We proceed to explore further aspects of Yukawa’s force mechanism. The reader should
note that throughout the remainder of this book we shall generally (unless otherwise stated)
use units such that $\hbar = c = 1$: see Appendix B.

3 In this kind of argument, the ‘$\sim$’ sign should be understood as meaning that numerical factors of order
1 (such as 2 or $\pi$) are not important. The coincidence between (1.25) and (1.20) should not be taken too
literally. Nevertheless, the physics of (1.25) is qualitatively correct.


1.3.3 The one-quantum exchange amplitude


FIGURE 1.2
Scattering by a static point-like U-source.

Consider a particle, carrying ‘strong charge’ $g_N$, being scattered by an infinitely massive
(static) point-like U-source also of ‘charge’ $g_N$ as pictured in figure 1.2. From the previous
section, we know that the potential energy in the Schrödinger equation for the scattered
particle is precisely the $U(r)$ from (1.13). Treating this to its lowest order in $U(r)$ (‘Born
Approximation’, see appendix H), the scattering amplitude is proportional to the Fourier
transform of $U(\mathbf{r})$:

$$f(\mathbf{q}) = \int e^{i\mathbf{q}\cdot\mathbf{r}}\, U(\mathbf{r})\, d^3 r \tag{1.26}$$


where $\mathbf{q}$ is the momentum (or wavevector, since $\hbar = 1$) transfer $\mathbf{q} = \mathbf{k} - \mathbf{k}'$. The transform
is evaluated in appendix G equation (G.24), or in problem 1.1, with the result

$$f(\mathbf{q}) = -\frac{g_N^2}{\mathbf{q}^2 + m_U^2}. \tag{1.27}$$

This implies that the amplitude (in this static case) for one-U exchange is
proportional to $-1/(\mathbf{q}^2 + m_U^2)$, where $\mathbf{q}$ is the momentum carried by the U-quantum.
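The result (1.27) can be checked numerically without appendix G. The sketch below assumes units $\hbar = c = 1$ (so that $m_U = 1/a$), uses the standard reduction of the three-dimensional transform of a spherically symmetric function to the radial integral $(4\pi/|\mathbf{q}|)\int_0^\infty r \sin(|\mathbf{q}| r)\, U(r)\, dr$, and evaluates it with a composite Simpson rule. The function names, the cut-off and the number of steps are illustrative choices only.

```python
import math

def yukawa_ft_numeric(q, m, g2=1.0, n=20_000):
    """Radial form of (1.26) for the Yukawa potential, by Simpson's rule.

    q is the magnitude of the momentum transfer and m = 1/a the mass of the
    exchanged quantum, in units with hbar = c = 1. n must be even.
    """
    r_max = 60.0 / m          # exp(-m r) is negligible beyond this cut-off
    h = r_max / n

    def integrand(r):
        # (4 pi / q) * r * sin(q r) * U(r), with U(r) = -(g2 / 4 pi) exp(-m r) / r
        return -(g2 / q) * math.sin(q * r) * math.exp(-m * r)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def yukawa_ft_exact(q, m, g2=1.0):
    """Closed form (1.27): -g2 / (q^2 + m^2)."""
    return -g2 / (q**2 + m**2)

for q in (0.5, 2.0, 5.0):
    num, exact = yukawa_ft_numeric(q, 1.0), yukawa_ft_exact(q, 1.0)
    print(f"|q| = {q:3.1f}   numeric = {num:+.8f}   closed form = {exact:+.8f}")
```

The two columns agree to the accuracy of the quadrature, and the fall-off at large $|\mathbf{q}|$ shows directly how a shorter range (larger $m_U$) flattens the momentum dependence.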

In this scattering by an infinitely massive source of potential, the energy of the scattered
particle cannot change. In a real scattering process such as that in figure 1.1, both energy
and momentum can be transferred by the U-quantum; that is, $\mathbf{q}$ is replaced by the four-momentum
$q = (q_0, \mathbf{q})$, where $q_0 = k_0 - k_0'$. Then, as indicated in appendix G, the factor
$-1/(\mathbf{q}^2 + m_U^2)$ is replaced by $1/(q^2 - m_U^2)$ and the amplitude for figure 1.1 is, in this model,

$$\frac{g_N^2}{q^2 - m_U^2}. \tag{1.28}$$

It will be the main burden of chapters 5 and 6 to demonstrate just how this formula is arrived
at, using the formalism of quantum field theory. In particular, we shall see in detail how
the propagator $(q^2 - m_U^2)^{-1}$ arises. For the present, we can already note (from appendix G)
that such propagators are, in fact, momentum-space Green functions.

In chapter 6 we shall also discuss other aspects of the physical meaning of the propagator, 
and we shall see how diagrams which we have begun to draw in a merely descriptive way 
become true ‘Feynman diagrams’, each diagram representing by a precise mathematical 
correspondence a specific expression for a quantum amplitude, as calculated in perturbation 
theory. The expansion parameter of this perturbation theory is the dimensionless number 
$g_N^2/4\pi$ appearing in the potential $U(r)$ (cf (1.13)). In terms of Feynman diagrams, we shall
learn in chapter 6 that one power of $g_N$ is to be associated with each ‘vertex’ at which a
U-quantum is emitted or absorbed. Thus successive terms in the perturbation expansion
correspond to exchanges of more and more quanta. Quantities such as $g_N$ are called ‘coupling
strengths’, or ‘coupling constants’.

FIGURE 1.3
One photon exchange mechanism between charged leptons. 


It is not too early to emphasize one very important point to the reader: true Feynman 
diagrams are representations of momentum-space amplitudes. They are not representations
of space-time processes: all space-time points are integrated over in arriving at the formula 
represented by a Feynman diagram. In particular, the two ‘intuitive’ diagrams of figure 1.1, 
which carry an implied ‘time-ordering’ (with time increasing to the right), are both included 
in a single Feynman diagram with propagator (1.28), as we shall see in detail (for an 
analogous case) in section 7.1. 

We now indicate how these general ideas of Yukawa apply to the actual interactions of 
quarks and leptons. 


1.3.4 Electromagnetic interactions 


From the foregoing viewpoint, electromagnetic interactions are essentially a special case
of Yukawa’s picture, in which $g_N^2$ is replaced by the appropriate electromagnetic charges,
and $m_U \to m_\gamma = 0$ so that $a \to \infty$ and the potential (1.13) returns to the Coulomb one,
$-e^2/4\pi r$. A typical one-photon exchange scattering process is shown in figure 1.3, for which
the generic amplitude (1.28) becomes

$$e^2/q^2. \tag{1.29}$$


Note that we have drawn the photon line ‘vertically’, consistent with the fact that both time- 
orderings of the type shown in figure 1.1 are included in (1.29). In the case of electromagnetic 
interactions, the coupling strength is e and the expansion parameter of perturbation theory 
is $e^2/4\pi = \alpha \approx 1/137$ (see appendix C).

We can immediately use (1.29) to understand the famous $\sim \sin^{-4}(\theta/2)$ angular variation
of Rutherford scattering. Treating the target muon as infinitely heavy (so as to simplify the
kinematics), the electron scatters elastically so that $q_0 = 0$ and $q^2 = -(\mathbf{k} - \mathbf{k}')^2$ where $\mathbf{k}$ and
$\mathbf{k}'$ are the incident and final electron momenta. So $q^2 = -2\mathbf{k}^2(1 - \cos\theta) = -4\mathbf{k}^2\sin^2(\theta/2)$
where we have used the elastic scattering condition $\mathbf{k}^2 = \mathbf{k}'^2$. By inserting this into (1.29)
and remembering that the cross section is proportional to the square of the amplitude
(appendix H), we obtain the distribution $\sin^{-4}(\theta/2)$. Thus, such a distribution is a clear
signature that the scattering is proceeding via the exchange of a massless quantum.
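The kinematic step can be made concrete with a short numerical sketch. It assumes, as in the text, an infinitely heavy target (so $q_0 = 0$ and $|\mathbf{k}| = |\mathbf{k}'|$) and keeps only the squared amplitude from (1.29), dropping all overall kinematic factors; the function names and the sample angles are illustrative.

```python
import math

def q_squared(k, theta):
    """q^2 = -(k - k')^2 = -2 k^2 (1 - cos theta) for |k| = |k'| and q_0 = 0."""
    return -2.0 * k**2 * (1.0 - math.cos(theta))

def relative_cross_section(k, theta, e2=1.0):
    """Square of the amplitude e^2 / q^2 of (1.29), up to overall factors."""
    return (e2 / q_squared(k, theta)) ** 2

k = 1.0  # magnitude of the electron momentum, arbitrary units
for deg in (30, 60, 90, 150):
    theta = math.radians(deg)
    # If the rate goes as sin^-4(theta/2), this product is the same at every angle.
    scaled = relative_cross_section(k, theta) * math.sin(theta / 2.0) ** 4
    print(f"theta = {deg:3d} deg   rate * sin^4(theta/2) = {scaled:.6f}")
```

The product is independent of angle, equal to $e^4/16\mathbf{k}^4$, confirming that the whole angular dependence comes from the massless propagator.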

Unfortunately, the detailed application of these ideas to the electromagnetic interactions
of quarks and leptons is complicated, because the electromagnetic potentials are
the components of a 4-vector (see chapter 2), rather than a scalar as in (1.29), and the
quarks and leptons all have spin-$\frac{1}{2}$, necessitating the use of the Dirac equation (chapter 3).
Nevertheless, (1.29) remains the essential ‘core’ of electromagnetic amplitudes.

FIGURE 1.4
Yukawa’s U-exchange mechanism for neutron $\beta$-decay.

As far as the electromagnetic field is concerned, its 4-vector nature is actually a fun- 
damental feature, having to do with a symmetry called gauge invariance, or (better) local 
phase invariance. As we shall see in chapters 2 and 7, the form of the electromagnetic inter- 
action is very strongly constrained by this symmetry. In fact, turning the argument around, 
one can (almost) understand the necessity of electromagnetic interactions as being due to 
the requirement of gauge invariance. Most significantly, we shall see in section 7.3.1 how 
the masslessness of the photon is also related to gauge invariance. 

In chapter 8 a number of elementary electromagnetic processes will be fully analysed, 
and in chapter 11 we shall discuss higher-order corrections in QED. 


1.3.5 Weak interactions 


In a bold extension of his ‘strong force’ idea, Yukawa applied his theory to
neutron $\beta$-decay as well, via the hypothesized process shown in figure 1.4 (here and in
figure 1.5 we revert to the more intuitive ‘time-ordered’ picture; the reader may supply the
diagrams corresponding to the other time-ordering). As indicated on the diagram, Yukawa
assigned the strong charge $g_N$ at the n-p end and a different ‘weak’ charge $g'$ at the lepton
end. Thus the same quantum mediated both strong and weak transitions, and he had an
embryonic ‘unified theory’ of strong and weak processes! If we take $U^-$ to be the $\pi^-$,
Yukawa’s mechanism predicts the existence of the weak decay $\pi^- \to e^- + \bar{\nu}_e$.

This decay does indeed occur, though at a much smaller rate than the main mode which
is $\pi^- \to \mu^- + \bar{\nu}_\mu$. But, apart from the now familiar problem with the compositeness of the
nucleons and pions, this kind of unification is not chosen by Nature. Not unreasonably in
1935, Yukawa was assuming that the range $\sim m_U^{-1}$ of the strong force in n-p scattering
(figure 1.1) was the same as that of the weak force in neutron $\beta$-decay (figure 1.4); after
all, the latter (and more especially positron emission) was viewed as a nuclear process. But
this is now known not to be the case: in fact, the range of the weak force is much smaller
than nuclear dimensions or, equivalently (see (1.19)), the masses of the mediating quanta
are much greater than that of the pion.

$\beta$-decay is now understood as occurring at the quark level via the $W^-$-exchange process
shown in figure 1.5(a). Similarly, positron emission proceeds via figure 1.5(b). Other
‘charged current’ processes all involve $W^\pm$-exchange, generalized appropriately to include
flavour mixing effects (see volume 2). ‘Neutral current’ processes involve exchange of the
$Z^0$-quantum; an example is given in figure 1.6. The quanta $W^\pm$, $Z^0$ therefore mediate these
weak interactions as does the photon for the electromagnetic one. Like the photon, the W
and Z are the quanta of 4-vector fields$^4$ and have spin 1, but unlike the photon, the
masses of the W and Z are far from zero: in fact, $M_W \approx 80$ GeV and $M_Z \approx 91$ GeV. So
the range of the force is $\sim M_W^{-1} \sim 2.5 \times 10^{-18}$ m, much less than typical nuclear dimensions
($\sim$ few $\times 10^{-15}$ m). This, indeed, is one way of understanding why the weak interactions
appear to be so weak; this range is so tiny that only a small part of the hadronic volume is
affected.

FIGURE 1.5
(a) $\beta$-decay and (b) $e^+$ emission at the quark level, mediated by $W^\pm$.

FIGURE 1.6
$Z^0$-exchange process.

Thus Nature has not chosen to unify the strong and weak forces via a common mediating 
quantum. Instead, it has turned out that the weak and strong forces (see section 1.3.6) are 
both gauge theories, generalizations of electromagnetism, as will be discussed in volume 2. 
This raises the possibility that it may be possible to ‘unify’ all three forces. 

4 This is dictated by the phenomenology of weak interactions; see chapter 20 in volume 2.

FIGURE 1.7
Point-like four-fermion interaction.

Some initial idea of how this works in the ‘electroweak’ case may be gained by considering
the amplitude for figure 1.5(a) in the low $-q^2$ limit. In a simplified version, which is
analogous to (1.29) and ignores the spin of the W and the leptons, the amplitude is

$$g^2/(q^2 - M_W^2) \tag{1.30}$$

where $g$ is a ‘weak charge’ associated with W-emission and absorption. In actual $\beta$-decay,
the square of the 4-momentum transfer $q^2$ is tiny compared to $M_W^2$, so that (1.30) becomes
independent of $q^2$ and takes the constant value $-g^2/M_W^2$. This corresponds, in configuration
space, to a point-like interaction (the Fourier transform of a delta function is a
constant). Just such a point-like interaction, shown in figure 1.7, had been postulated by
Fermi (1934a, b) in the first theory of $\beta$-decay: it is a ‘four-fermion’ interaction with strength
$G_F$. The value of $G_F$ can be determined from measured $\beta$-decay rates. The dimensions of
$G_F$ turn out to be energy $\times$ volume, so that $G_F/(\hbar c)^3$ has dimension (energy)$^{-2}$. In our
units $\hbar = c = 1$, the numerical value of $G_F$ is

$$G_F \sim (300\ \text{GeV})^{-2}. \tag{1.31}$$

If we identify this constant with $g^2/M_W^2$ we obtain

$$g^2 \sim M_W^2/(300\ \text{GeV})^2 \sim 0.064 \tag{1.32}$$

a value quite similar to that of the electromagnetic charge $e^2$ as determined from $e^2 =
4\pi\alpha \sim 0.09$. Though this is qualitatively correct, we shall see in volume 2 that the actual
relation, in the electroweak theory, between the weak and electromagnetic coupling strengths
is somewhat more complicated than the simple equality ‘$g = e$’. (Note that a corresponding
connection with Fermi’s theory was also made by Yukawa!)
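The order-of-magnitude comparison of the two coupling strengths can be reproduced directly. The sketch below uses the rounded inputs quoted in this section ($M_W \approx 80$ GeV, a Fermi scale of 300 GeV, $\alpha \approx 1/137$); the constant and function names are illustrative. With these particular roundings the weak value comes out close to 0.07, the same order as the figure quoted in (1.32), and the exact number depends on the inputs chosen.

```python
import math

M_W_GEV = 80.0            # W mass, rounded as in the text
FERMI_SCALE_GEV = 300.0   # G_F ~ (300 GeV)^-2, from (1.31)
ALPHA = 1.0 / 137.0       # fine structure constant, rounded

def weak_coupling_squared(m_w=M_W_GEV, scale=FERMI_SCALE_GEV):
    """g^2 ~ G_F * M_W^2 = (M_W / 300 GeV)^2, identifying G_F with g^2 / M_W^2."""
    return (m_w / scale) ** 2

def em_coupling_squared(alpha=ALPHA):
    """e^2 = 4 pi alpha in units hbar = c = 1."""
    return 4.0 * math.pi * alpha

g2 = weak_coupling_squared()
e2 = em_coupling_squared()
print(f"g^2 ~ {g2:.3f}   e^2 ~ {e2:.3f}   g^2 / e^2 ~ {g2 / e2:.2f}")
```

The ratio is of order one, which is the sense in which the two interactions have similar intrinsic strengths.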

We can now understand the ‘weakness’ of the weak interactions from another viewpoint.
For $q^2 \ll M_W^2$, the ratio of the weak amplitude (1.30) to the electromagnetic amplitude
(1.29) is of order $q^2/M_W^2$, given that $e \sim g$. Thus despite having an intrinsic strength
similar to that of electromagnetism, weak interactions will appear very weak at low energies
such that $q^2 \ll M_W^2$. At energies approaching $M_W$, however, weak interactions will grow in
importance relative to electromagnetic ones and, when $q^2 \gg M_W^2$, weak and electromagnetic
interactions will contribute roughly equally.
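To see how quickly this suppression sets in, one can tabulate the magnitude of the ratio of the weak amplitude (1.30) to the electromagnetic amplitude (1.29). The sketch assumes $g = e$ exactly and a spacelike momentum transfer $q^2 = -Q^2$, both simplifications made for illustration, and the sample values of $Q$ are arbitrary.

```python
def weak_to_em_ratio(big_q_gev, m_w_gev=80.0):
    """|weak / electromagnetic| amplitude ratio for q^2 = -Q^2, taking g = e.

    With these assumptions the ratio equals Q^2 / (Q^2 + M_W^2).
    """
    q2 = -big_q_gev**2
    weak = 1.0 / (q2 - m_w_gev**2)   # propagator factor of (1.30)
    em = 1.0 / q2                    # propagator factor of (1.29)
    return abs(weak / em)

# From beta-decay scales (MeV) up to well above the W mass.
for big_q in (0.001, 1.0, 80.0, 800.0):
    print(f"Q = {big_q:8.3f} GeV   weak/em ~ {weak_to_em_ratio(big_q):.3e}")
```

At MeV-scale momentum transfers the ratio is of order $10^{-10}$, it reaches one half at $Q = M_W$, and it approaches unity only well above the W mass.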

‘Similar’ coupling strengths are still not ‘unified’, however. True unification only occurs 
after a more subtle effect has been included, which goes beyond the one-quantum exchange 
mechanism. This is the variation or ‘running’ of the coupling strengths as a function of 
energy (or distance), caused by higher-order processes in perturbation theory. This will be 
discussed more fully in chapter 11 for QED, and in volume 2 for the other gauge couplings. 
It turns out that the possibility of unification depends crucially on an important difference 
between the weak interaction quanta $W^\pm$ (to take the present example) and the photons
of QED, which has not been apparent in the simple $\beta$-decay processes considered so far.
The W’s are themselves ‘weakly charged’, acting as both carriers and sources of the weak
force field, and they therefore interact directly amongst themselves even in the absence
of other matter. By contrast, photons are electromagnetically neutral and have no direct
self-interactions. In theories where the gauge quanta self-interact, the coupling strength 
decreases as the energy increases, while for QED it increases. It is this differing ‘evolution’ 
that tends to bring the strengths together, ultimately. 

Even granted similar coupling strengths and the fact that both are 4-vector fields, the 
idea of any electroweak unification appears to founder immediately on the markedly different 
ranges of the two forces or, equivalently, of the masses of the mediating quanta ($m_\gamma =
0$, $M_W \approx 80$ GeV!). This difficulty becomes even more pointed when we recall that, as
previously mentioned, the masslessness of the photon is related to gauge invariance in 
electrodynamics: how then can there be any similar kind of gauge symmetry for weak 
interactions, given the distinctly non-zero masses of the mediating quanta? Nevertheless, in 
one of the great triumphs of twentieth century theoretical physics, it is possible to see the
two theories as essentially similar gauge theories, the gauge symmetry being ‘spontaneously 
broken’ in the case of weak interactions. This is a central feature of the GSW electroweak 
theory. An indication of how gauge quanta might acquire mass will be given in section 11.4 
but a fuller explanation, with application to the electroweak theory, is reserved for volume 2. 
We will have a few more words to say about it in section 1.4.1. 


1.3.6 Strong interactions 


We turn to the contemporary version of Yukawa’s theory of strong interactions, now viewed 
as occurring between quarks rather than nucleons. Evidence that the strong interquark 
force is in some way similar to QED comes from nucleon-nucleon (or nucleon-antinucleon) 
collisions. Regarding the nucleons as composites of point-like quarks, we would expect to 
see prominent events at large scattering angles corresponding to ‘hard’ q-q collisions (re- 
call Rutherford’s discovery of the nucleus). Now the result of such a hard collision would 
normally be to scatter the quarks to wide angles, ‘breaking up’ the nucleons in the pro- 
cess. However, quarks (except for the t quark) are not observed as free particles. Instead, 
what appears to happen is that, as the two quarks separate from each other, their mutual 
potential energy increases, so much so that, at a certain stage in the evolution of the scattering 
process, the energy stored in the potential converts into a new $q\bar{q}$ pair. This process 
continues, with in general many pairs being produced as the original and subsequent pairs 
pull apart. By a mechanism which is still not quantitatively understood in detail, the pro- 
duced quarks and anti-quarks (and the original quarks in the nucleons) bind themselves 
into hadrons within an interaction volume of order 1 fm$^3$, so that no free quarks are finally 
observed, consistent with ‘confinement’. Very strikingly, these hadrons emerge in quite well- 
collimated ‘jets’, suggesting rather vividly their ancestry in the original separating qq pair. 
Suppose, then, that we plot the angular distribution of such ‘two-jet events’; it should tell 
us about the dynamics of the original interaction at the quark level. 

Figure 1.8 shows such an angular distribution from proton-antiproton scattering, so that 
the fundamental interaction in this case is the elastic scattering process $q\bar{q} \to q\bar{q}$. Here $\theta$ is 
the scattering angle in the $q\bar{q}$ centre of mass system (CMS). Amazingly, the $\theta$-distribution 
follows almost exactly the ‘Rutherford’ form $\sin^{-4}\theta/2$. 

We saw how, in the Coulomb case, this distribution could be understood as arising from 
the propagator factor $1/\boldsymbol{q}^2$, which itself comes from the $1/r$ potential associated with the 
massless quantum involved, namely the photon. In the present case, the same $1/\boldsymbol{q}^2$ factor 
is responsible. Here, in the $q\bar{q}$ centre of mass system, $\boldsymbol{k}$ and $-\boldsymbol{k}$ are the momenta of the 
initial $q$ and $\bar{q}$, while $\boldsymbol{k}'$ and $-\boldsymbol{k}'$ are the corresponding final momenta. Once again, for 
elastic scattering there is no energy transfer, and $q^2 = -\boldsymbol{q}^2 = -(\boldsymbol{k} - \boldsymbol{k}')^2 = -4\boldsymbol{k}^2 \sin^2\theta/2$ 
as before, leading to the $\sin^{-4}\theta/2$ form on squaring $1/q^2$. Once again, such a distribution 
is a clear signal that a massless quantum is being exchanged, in this case the gluon. 
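
The step from the propagator factor to the angular shape can be checked numerically. The following is only a minimal sketch: the function names and the CMS momentum of 10 GeV are illustrative assumptions, and all overall constants (couplings, flux factors) are dropped. It shows that squaring $1/q^2$ with $q^2 = -4\boldsymbol{k}^2\sin^2\theta/2$ gives a quantity whose ratio to $\sin^{-4}\theta/2$ is the same at every angle.

```python
import math

def q_squared(k, theta):
    """Momentum transfer squared for elastic CMS scattering: q^2 = -4 k^2 sin^2(theta/2)."""
    return -4.0 * k**2 * math.sin(theta / 2.0) ** 2

def exchange_rate(k, theta):
    """Square of the massless propagator factor 1/q^2 (overall constants dropped)."""
    return 1.0 / q_squared(k, theta) ** 2

def rutherford_shape(theta):
    """The 'Rutherford' angular form sin^-4(theta/2)."""
    return math.sin(theta / 2.0) ** -4

k = 10.0  # assumed illustrative CMS momentum in GeV
for degrees in (20, 45, 90, 150):
    theta = math.radians(degrees)
    # The ratio equals 1 / (16 k^4), independent of the angle
    print(degrees, exchange_rate(k, theta) / rutherford_shape(theta))
```

Any massive exchanged quantum would replace $1/q^2$ by $1/(q^2 - M^2)$ and spoil this angle-independent ratio at small $|q^2|$, which is why the observed shape points to a massless quantum.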
FIGURE 1.8 

Angular distribution of two-jet events in $p\bar{p}$ collisions (Arnison et al. 1985) as a function 
of $\cos\theta$, where $\theta$ is the CMS scattering angle. The broken curve is the prediction of QCD, 
obtained in the lowest order of perturbation theory (one-gluon exchange); it is virtually 
indistinguishable from the Rutherford (one-photon exchange) shape $\sin^{-4}\theta/2$. The full curve 
includes higher order QCD corrections. 


It might then seem to follow that, as in the case of QED, the QCD interaction has 
infinite range. But this cannot be right; the strong forces do not extend beyond the size of 
a typical hadron, which is roughly 1 fm. Indeed, the QCD force is mediated by the massless 
spin-1 gluon, and QCD is also a gauge theory; but the form of the QCD interaction, though 
somewhat analogous to QED, is more complicated, and the long range behaviour of the 
force is very different. 

As we have seen, each quark comes in three colours, and the QCD force is sensitive to 
this colour label: the gluons effectively ‘carry colour’ back and forth between the quarks, as 
shown in the one-gluon exchange process of figure 1.9. Because the gluons carry colour, they 
can interact with themselves, like the W’s and Z’s of the GSW theory. As in that case, these 
gluonic self-interactions cause the QCD interaction strength to decrease at short distances 
(or high energies), ultimately tending to zero, the property known as asymptotic freedom. 
So in ‘hard’ collisions occurring at short inter-particle distances, the one-gluon exchange 
mechanism gives a good first approximation to the data. But the force grows much stronger 
as the quarks separate from each other, and perturbation theory is no longer a reliable 
guide. In fact, it seems that a new, non-perturbative, effect occurs—namely confinement. 
Once again, a gauge theory, with formal similarity to QED, has very different physical 
consequences. 

A phenomenological $q\bar{q}$ (or $qq$) potential which is often used in quark models has the 
form 

$$V = -\frac{a}{r} + br$$ 

where the first term, which dominates at small $r$, arises from a single-gluon exchange so that 
$a \sim g_s^2$, where the strong (QCD) charge is $g_s$. The second term models confinement at larger 
FIGURE 1.9 

Strong scattering via gluon exchange. At the top vertex, the ‘flow’ of colour is $b$ (quark) 
$\to$ $r$ (quark) + $\bar{r}b$ (gluon) and at the lower vertex the flow is $\bar{r}b$ (gluon) + $r$ (quark) $\to$ $b$ 
(quark). 


values of $r$. Such a potential provides quite a good understanding of the gross structure of 
the $c\bar{c}$ and $b\bar{b}$ systems (see problem 1.5). A typical value for $b$ is 0.85 GeV fm$^{-1}$ (which 
corresponds to a constant force of about 14 tonnes!). Thus at r ~ 2 fm, there is enough 
energy stored to produce a pair of the lighter quarks. This ‘linear’ part of the potential 
cannot be obtained by considering the exchange of one, or even a finite number of, gluons: 
in other words, not within an approach based on perturbation theory. 
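
The two numbers quoted in this paragraph follow from a simple unit conversion. As a rough sketch (the helper names are ours, and expressing the force as a weight under standard gravity is our reading of ‘tonnes’), one may check them as follows.

```python
GEV_IN_JOULES = 1.602176634e-10  # 1 GeV expressed in joules
FM_IN_METRES = 1.0e-15           # 1 fm expressed in metres
STANDARD_GRAVITY = 9.80665       # m/s^2, used to quote a force as a weight

def string_tension_newtons(b_gev_per_fm):
    """Convert the slope b of the linear potential from GeV/fm to newtons."""
    return b_gev_per_fm * GEV_IN_JOULES / FM_IN_METRES

def linear_energy_gev(b_gev_per_fm, r_fm):
    """Energy stored in the linear term b*r at separation r (in fm)."""
    return b_gev_per_fm * r_fm

b = 0.85  # GeV/fm, the typical value quoted in the text
force_newtons = string_tension_newtons(b)
force_tonnes = force_newtons / STANDARD_GRAVITY / 1000.0
print(f"force = {force_newtons:.3e} N, i.e. the weight of about {force_tonnes:.1f} tonnes")
print(f"energy stored at r = 2 fm: {linear_energy_gev(b, 2.0):.2f} GeV")
```

The stored energy at 2 fm, 1.7 GeV, is comfortably larger than the phenomenological masses of a pair of the lighter quarks, consistent with the statement above.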

It is interesting to note that the linear part of the potential may be regarded as the solution 
of the one-dimensional form of $\nabla^2 V = 0$, namely $d^2V/dr^2 = 0$; this is in contrast to 
the Coulombic 1/r part, which is a solution (except at r = 0) to the full three-dimensional 
Laplace equation. This suggests that the colour field lines connecting two colour charges 
spread out into all of space when the charges are close to each other, but are somehow 
‘squeezed’ into an elongated one-dimensional ‘string’ as the distance between the charges be- 
comes greater than about 1 fm. In the second volume, we shall see that numerical simulations 
of QCD, in which the space-time continuum is represented as a discrete lattice of points, 
indicate that such a linear potential does arise when QCD is treated non-perturbatively. It 
remains a challenge for theory to demonstrate that confinement follows from QCD. 

It is believed that gluons too are confined by QCD, so that—like quarks—they are not 
seen as isolated free particles. But they too ‘hadronize’ after being produced in a primitive 
short-distance collision process, as happens in the case of $q$’s and $\bar{q}$’s. Such ‘gluon jets’ 
provide indirect evidence for the existence and properties of gluons, as we shall see in 
volume 2. 

This is an appropriate moment at which to emphasize what appears to be a crucial 
distinction between the three ‘charges’ (electromagnetic, weak and strong) on the one hand, 
and the various flavour quantum numbers on the other. The former have a dynamical 
significance, whereas the latter do not. In the case of electric charge, for example, this means 
simply that a particle carrying this property responds in a definite way to the presence of 
an electromagnetic field and itself creates such a field. No such force fields are known for 
any of the flavour numbers, which are (at present) purely empirical classification devices, 
without dynamical significance. 
TABLE 1.3 
Properties of SM gauge bosons. 

Particle            Polarization states   Mass                    Width/Lifetime 
$\gamma$ (photon)   2                     0 (theoretical)         stable 
$g$ (gluon)         2                     0 (theoretical)         stable 
$W^\pm$             3                     80.377 ± 0.012 GeV      $\Gamma_W$ = 2.085 ± 0.042 GeV 
$Z^0$               3                     91.1876 ± 0.0021 GeV    $\Gamma_Z$ = 2.4952 ± 0.0023 GeV 


1.3.7 The gauge bosons of the Standard Model 


We can now gather together the mediators of the SM forces. They are all gauge bosons, 
meaning that they are the quanta of various 4-vector gauge fields. For example, the photon 
is the quantum of the electromagnetic (Maxwell) 4-vector potential $A^\mu(x)$ (see chapter 
2 and section 7.3), which is the simplest gauge field. The gluon is the quantum of the 
QCD potential $A^\mu_a(x)$, where the colour index $a$ runs from 1 to 8. The reason there are 
8 of them may be guessed from figure 1.9: each gluon can be thought of as carrying one 
colour-anticolour combination, such as $\bar{r}b$, $\bar{b}g$, and so on; the symmetric combination 
$\bar{r}r + \bar{b}b + \bar{g}g$ is totally colourless and is discarded. In the GSW electroweak theory, there are 
four gauge fields, $W^\mu_i(x)$ where $i$ runs from 1 to 3, and $B^\mu(x)$ which is analogous to $A^\mu(x)$. 
One linear combination of $W^\mu_3(x)$ and $B^\mu(x)$ is associated with the photon field $A^\mu(x)$; the 
orthogonal combination is associated with the $Z^\mu(x)$ field whose quantum is the $Z^0$. The 
charged carriers $W^\pm$ are associated with the $W^\mu_1(x)$ and $W^\mu_2(x)$ components of the $W^\mu_i(x)$ 
field. 

We shall assume that the mass of the photon and of the gluon is exactly zero. This can 
never be established experimentally, of course: the current experimental limit on the photon 
mass is that it is less than $1 \times 10^{-18}$ eV (Workman et al. 2022). All gauge fields have spin 1 (in 
units of $\hbar$). Ordinarily, a spin-1 particle would be expected to have three polarization states, 
according to quantum mechanics. However, it is a general result that in the massless case 
the quanta have only two polarization states, both transverse to the direction of motion; 
the longitudinally polarized state is absent (this property, familiar for the corresponding 
classical fields which are purely transverse, will be discussed in section 7.3.1). By contrast, 
all three polarization states are present for the massive gauge bosons. 

The photon and the gluon are stable particles. The $W^\pm$ and $Z^0$ particles decay with 
total widths of the order of 2 GeV (lifetimes $\sim 0.3 \times 10^{-24}$ s). Although this is significantly 
shorter than typical strong interaction decay lifetimes, these are of course weak decays, the 
rate being enhanced by the large energy release. 
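
The quoted lifetime follows from the relation $\tau = \hbar/\Gamma$ between lifetime and total width. A minimal sketch, using the widths of Table 1.3 (the function name is ours; the value of $\hbar$ in GeV s is a standard constant not quoted in the text):

```python
HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV s

def lifetime_seconds(width_gev):
    """Mean lifetime tau = hbar / Gamma for a state of total width Gamma (in GeV)."""
    return HBAR_GEV_S / width_gev

tau_w = lifetime_seconds(2.085)   # W width from Table 1.3
tau_z = lifetime_seconds(2.4952)  # Z width from Table 1.3
print(f"tau_W = {tau_w:.2e} s, tau_Z = {tau_z:.2e} s")
```

Both results are close to $0.3 \times 10^{-24}$ s, as stated above.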

Table 1.3 lists the properties of the SM gauge bosons; the masses and widths are taken 
from Workman et al. (2022). It should be noted that in April 2022, too late to be included 
in the world average value for $M_W$ given in Table 1.3, the CDF collaboration published 
(Aaltonen et al. 2022) a new result for the W mass based on their full Run 2 data set with 
much reduced uncertainty: $M_W = 80.4335 \pm 0.0094$ GeV, which disagrees significantly with 
the value in Table 1.3. More recently, however, the ATLAS Collaboration (ATLAS 2023) 
reported the preliminary result $M_W = 80.360 \pm 0.016$ GeV, which agrees with the value in 
Table 1.3. The discrepancy remains to be resolved. 
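
To give a rough feeling for what ‘disagrees significantly’ and ‘agrees’ mean here, one can form the difference of two values in units of their combined uncertainty. The sketch below is a naive estimate only: it assumes Gaussian, uncorrelated uncertainties, which is not strictly true since the world average already contains earlier measurements, and the function name is ours.

```python
import math

def pull(value_a, sigma_a, value_b, sigma_b):
    """Difference of two measurements in units of their combined (quadrature) uncertainty."""
    return (value_a - value_b) / math.hypot(sigma_a, sigma_b)

world_average = (80.377, 0.012)  # GeV, Table 1.3
cdf_2022 = (80.4335, 0.0094)     # GeV, Aaltonen et al. 2022
atlas_2023 = (80.360, 0.016)     # GeV, ATLAS 2023 (preliminary)

print(f"CDF   vs world average: {pull(*cdf_2022, *world_average):+.2f} sigma")
print(f"ATLAS vs world average: {pull(*atlas_2023, *world_average):+.2f} sigma")
```

On this crude measure the CDF value lies between three and four standard deviations above the world average, while the ATLAS value lies within one standard deviation of it.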
1.4 Renormalization and the Higgs sector of the Standard Model 
1.4.1 Renormalization 


So far we have been discussing processes in which only one particle is exchanged. These 
will generally be the terms of lowest order in a perturbative expansion in powers of the 
coupling strength. But we must clearly go beyond lowest order, and include the effects 
of multi-particle exchanges. We shall explain how to do this in chapter 10, for a simple 
scalar field theory. Such multi-particle exchange amplitudes are given by integrals over the 
momenta of the exchanged particles, constrained only by four-momentum conservation (no 
integral arises in the case of the exchange of a single particle, because its four-momentum 
is fixed in terms of the momenta of the scattering particles, as in section 1.2.3). It turns out 
that the integrals nearly always diverge as the momenta of the exchanged particles tend to 
infinity. Nevertheless, as we shall explain in chapter 10, this theory can be reformulated, 
by a process called renormalization, in such a way that all multi-particle (higher-order) 
processes become finite and calculable—a quite remarkable fact, and one that is of course 
an absolutely crucial requirement in the case of the Standard Model interactions, where the 
relevant data are precise enough to test the accuracy of the theory well beyond lowest order, 
particularly in the case of QED (see chapter 11). The price to be paid for this taming of 
the divergences is just that the basic parameters of the theory, such as masses and coupling 
constants, have to be treated as parameters to be determined by comparison to the data, 
and cannot themselves be calculated. 

But some theories cannot be reformulated in this way: they are non-renormalizable. 
A simple test for whether a theory is renormalizable or not will be discussed in section 
11.8: if the coupling constant has dimensions of a mass to an inverse power, the theory is 
non-renormalizable. An example of such a theory is the original four-Fermi theory of weak 
interactions, where the coupling constant $G_F$ has the dimensions of an inverse square mass 
(or energy) as we saw in (1.31). We will look at this theory again in section 11.8, but the 
essential point for our purpose now is that the dimensionful coupling constant introduces 
an energy scale into the problem, namely $G_F^{-1/2} \sim 300$ GeV. It seems reasonable to infer 
that a more relevant measure of the interaction strength will be given by the dimensionless 
number $E G_F^{1/2}$, where $E$ is a characteristic physical energy scale of any weak process under 
consideration, for example the energy in the centre of momentum frame in a two-particle 
scattering process, at least at energies much greater than the particle masses. Then, for 
energies very much less than $G_F^{-1/2}$, the effective strength will be very weak, and the 
lowest order term in perturbation theory will work fine; this is how the Fermi theory was 
used, for many years. But as the energy increases, what happens is that more and more 
parameters have to be taken from experiment in order to control the divergences. As the 
energy approaches $G_F^{-1/2}$, the theory becomes totally non-predictive and breaks down. Thus 
renormalizability is regarded as highly desirable in a theory. 
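
The energy scale set by the Fermi constant, and the smallness of the dimensionless strength at low energies, can be made concrete with a short calculation. This is only an illustrative sketch: the numerical value of $G_F$ used below is the standard Particle Data Group one and is an input not quoted in this section, and the function name is ours.

```python
import math

G_F = 1.1663788e-5  # Fermi constant in GeV^-2 (assumed PDG value, not taken from the text)

fermi_scale_gev = G_F ** -0.5  # the energy scale G_F^(-1/2)

def effective_strength(energy_gev):
    """Dimensionless measure E * G_F^(1/2) of the four-fermion interaction strength."""
    return energy_gev * math.sqrt(G_F)

print(f"G_F^(-1/2) = {fermi_scale_gev:.0f} GeV")
for energy in (0.001, 1.0, 100.0, fermi_scale_gev):
    print(f"E = {energy:8.3f} GeV  ->  E * G_F^(1/2) = {effective_strength(energy):.2e}")
```

At the MeV energies typical of nuclear beta decay the measure is of order $10^{-6}$, whereas it reaches unity at roughly 300 GeV, where the lowest-order treatment can no longer be trusted.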

One might hope to come up with a renormalizable theory of weak interactions by replac- 
ing the four-fermion interaction by a Yukawa-like mechanism, with exchange of a quantum 
of mass M and dimensionless coupling y, say. Then just as in (1.32) we would identify 
$G_F \sim y^2/M^2$ at low energies. However, as we have seen, phenomenology implies that the 
massive exchanged quantum must have spin 1. Unfortunately, this type of straightforward 
massive spin-1 theory is not renormalizable either, as we shall discuss in chapter 22. The 
trouble can be traced directly to the existence of the longitudinal polarization state which, 
as noted previously, is present for a massive spin-1 particle. If the exchanged spin-1 quantum 
were massless, as in QED, it would lack that third polarization state, and the theory would 
be renormalizable. But weak interaction facts dictate both non-zero mass and spin-1. 

In the case of QED, there is a symmetry principle behind both the zero mass of the 
photon and the absence of the longitudinal polarization state: this symmetry is gauge in- 
variance as we shall explain in section 7.3.1. It turns out that this symmetry is vital in 
rendering QED renormalizable. It is natural then to ask whether in the case of QED, a 
situation ever arises where the photon acquires mass, while retaining fully gauge-invariant 
interactions—and hence renormalizability (we would hope). If so, we would then have an 
analogue of what is needed for a renormalizable theory of weak interactions. The answer 
is that this can indeed happen, but it requires some extra dynamics to do it. Nature has 
actually provided us with a working model of what we want, in the phenomenon of super- 
conductivity. There, the Meissner effect can be interpreted as implying that the photons 
propagating in a thin surface layer of the material have non-zero mass (see section 19.2). The 
dynamics behind this is subtle, and required many years of theoretical efforts before it was 
finally understood by Bardeen, Cooper and Schrieffer (1957). In simple terms, the mecha- 
nism is a two-step process. First, lattice interactions cause electrons to bind into pairs; then 
these pairs undergo Bose-Einstein condensation. This “condensate” is the Bardeen-Cooper- 
Schrieffer (BCS) superconducting ground state. The essential point is that although the 
electromagnetic interactions are fully gauge invariant, the ground state is not. When a 
symmetry is broken by the ground state, it is said to be ‘spontaneously’ broken. We shall 
provide an introduction to the BCS ground state in chapter 17 of volume 2. 

The BCS theory is an example of spontaneous symmetry breaking occurring dynamically 
(through the particular lattice interactions). Many of the physically important phenomena 
can, however, be very satisfactorily described in terms of an effective theory, which treats 
only the electrodynamics of the condensate. Such a description was proposed by Ginzburg 
and Landau (1950), well before the BCS paper, in fact. 

How can this be applied in particle physics? Recall the idea, mentioned in section 1.3.1, 
that the analogue of the many-body ground state is the qft vacuum (Nambu 1961). In 
the SM, the weak interactions are indeed described by a gauge-invariant theory, and the 
assumption is made that the vacuum breaks the gauge symmetry. The simplest way this 
idea can be implemented is along the lines of the Ginzburg-Landau theory, as suggested 
by Weinberg (1967) and by Salam (1968), and their proposal is embodied in the Glashow- 
Weinberg-Salam electroweak theory, which is part of the SM. It requires the introduction 
of four new spin-0 fields, which are called Higgs fields (Higgs 1964, Englert and Brout 1964, 
Guralnik et al. 1964), and which we may think of as playing the role of the BCS conden- 
sate (but not for electromagnetism, of course). The combined theory of quarks, leptons, 
electroweak gauge fields, and Higgs fields is gauge invariant, but one of the Higgs fields 
is supposed to have a non-zero average value in the physical vacuum, which breaks the 
gauge symmetry. The other three Higgs fields effectively become the longitudinal parts of 
the massive spin-1 $W^\pm$ and $Z^0$ fields, while the quantized excitations of the fourth Higgs 
field away from its vacuum value appear physically as neutral spin-0 particles, called Higgs 
bosons (Higgs 1964). 

Apart from giving mass to the $W^\pm$ and $Z^0$, the Higgs fields have more work to do. The 
electroweak gauge symmetry is exact only if all the fermion masses are zero; this is because it 
is a chiral symmetry (similar to, but not the same as, the chiral symmetry of QCD mentioned 
in section 1.2.2). Once again, this chiral gauge symmetry is essential to the renormalizability 
of the theory: if the fermion masses are incorporated in the usual way as parameters in the 
Lagrangian, the latter is no longer gauge invariant and the theory is non-renormalizable. 
In the SM, this problem is solved by having no fermion masses in the Lagrangian, and by 
postulating gauge-invariant Yukawa interactions between the fermions and the Higgs fields, 
which are arranged in such a way that, when the Higgs field gets a vacuum expectation 
value, the interaction terms yield just the fermion masses. So again, the symmetry breaking 
is economically blamed on the same property of the vacuum. When the Higgs field oscillates 
away from its vacuum value, the result will be residual Yukawa interactions between the 
fermions and the Higgs boson, which will have the defining characteristic that each fermion 
will interact with the Higgs boson with a strength proportional to its (i.e. the fermion’s) 
mass. As noted earlier, this feature of the Higgs sector violates lepton universality. 

We have emphasized the role that the Higgs fields play in the renormalizability of 
the GSW theory. The all-important proof of that renormalizability was given by ’t Hooft 
(1971b), and he also proved the renormalizability of QCD (1971a); see also ’t Hooft and 
Veltman (1972). 

The SM Higgs sector is the simplest one that will do the job; more complicated versions 
are possible. Perhaps the Higgs field is a composite formed in some new heavy fermion- 
antifermion dynamics, reminiscent of BCS pairing. In any case, the SM Higgs sector is 
there to be explored experimentally. In the following section, we shall discuss briefly what is 
presently known about the SM Higgs boson, postponing a fuller discussion until we present 
the GSW theory in chapter 22. 

Before ending this section, we must note that modern renormalization theory is con- 
cerned with more than perturbative calculability. The renormalization group and related 
ideas provide powerful tools for ‘improving’ perturbation theory, by systematically resum- 
ming terms which (in the particle physics case) dominate at short distances. Prominent 
among the results of this analysis (see chapters 15 and 16) are the concepts of energy- 
dependent (“running”) masses and coupling strengths, and the calculation of QCD correc- 
tions to parton-model predictions. 


1.4.2 The Higgs boson of the Standard Model 


According to the SM, just one neutral spin-0 Higgs boson is expected; its mass $m_H$ is not 
predicted by the theory. The experimental discovery of the SM Higgs boson was a major 
goal of several generations of accelerators: the LEP $e^+e^-$ collider at CERN, the Tevatron 
$p\bar{p}$ collider at Fermilab, and now the LHC $pp$ collider at CERN. Bounds on the Higgs mass 
could be obtained directly, through searching for its production and subsequent decay; 
non-observation led to a lower bound for $m_H$. There were also indirect constraints, coming 
from fits to precision measurements of electroweak observables. The latter are sensitive to 
higher-order corrections which involve the Higgs boson as a virtual particle; these depend 
logarithmically on the unknown parameter $m_H$ and gave upper bounds on $m_H$, assuming, 
of course, that the SM was correct. A lower bound $m_H > 114.4$ GeV was set at LEP (LEP 
2003) by combining data on direct searches. Combining this with a global fit to precision 
electroweak data, an upper bound $m_H < 186$ GeV was obtained (Nakamura et al. 2010). 

By early 2012, the combined results of the CDF and D0 experiments at the Tevatron, 
and the ATLAS and CMS experiments at the LHC, excluded an $m_H$ value in the interval 
(approximately) 130 GeV to 600 GeV, at 95 % C.L. Finally, in July 2012 the ATLAS (Aad 
et al. 2012) and CMS (Chatrchyan et al. 2012) collaborations announced the discovery, with 
a significance of $5\sigma$, of a neutral boson resonance state with a mass in the range 125-126 
GeV, its production and decay rates being broadly compatible with the predictions for the 
SM Higgs boson. 

In the decade following this landmark discovery, data collected during Run 1 (7 and 8 
TeV) and Run 2 (13 TeV) at the LHC have established that in all production and decay 
modes measured so far, the results are found to be consistent, within the experimental and 
theoretical uncertainties, with the predictions of the SM. We shall discuss this in more detail 
in volume 2. For the moment, we highlight some of the most important results. 
The mass of the Higgs boson has been measured at the 0.1% level via the decays $H \to \gamma\gamma$ 
and $H \to 4$ leptons (Sirunyan et al. 2021b, Aaboud et al. 2018c). The value now listed in 
the Particle Data Group tables is $m_H = 125.25 \pm 0.17$ GeV (Workman et al. 2022). The 
SM prediction for the total width of the Higgs boson is about 4 MeV, which is three 
orders of magnitude smaller than the experimental mass resolution. An indirect method 
gave the result $\Gamma_H = 3.2^{+2.8}_{-2.2}$ MeV (Sirunyan et al. 2019). The Yukawa couplings of the 
Higgs boson to fermions are of fundamental importance because they are directly related 
to the SM mechanism for giving masses to the fermions via the spontaneous breaking of 
the electroweak gauge symmetry. They can be measured from the decays of the Higgs 
boson to fermion-antifermion pairs. As mentioned earlier, these couplings are proportional 
to the fermion masses, and so the fermions of the third generation, which have the largest 
masses, will be the most easily measured. Clear evidence has been obtained for the Higgs 
boson decaying to a pair of b quarks (Aaboud et al. 2018b, Sirunyan et al. 2018b), and 
to a pair of $\tau$ leptons (Aaboud et al. 2019, Sirunyan et al. 2018c), with couplings in good 
agreement with the SM values. The production of a Higgs boson in association with a pair 
of t quarks (Aaboud et al. 2018a, Sirunyan et al. 2018a, 2020) enables a measurement of 
the coupling to the t quark, again in good agreement with the SM. The current precision 
on the measurements of these third generation couplings is of the order of 10-20 %. There 
is now evidence for the SM coupling to a second-generation fermion, the muon, via the $H 
\to$ muon pair channel (Sirunyan et al. 2021a). 

It is also necessary to establish the spin (J), parity (P), and charge conjugation (C) 
quantum numbers of the Higgs boson. The observation of the decays $H \to \gamma\gamma$ (Sirunyan et 
al. 2021b, Aaboud et al. 2018c) restricts the spin to 0 or 2 (Landau 1948, Yang 1950). ATLAS 
(Aad et al. 2015) and CMS (Khachatryan et al. 2015) reported strong evidence for spin-0 
and even parity. Since the photon has $C = -1$, and $C$ is a multiplicative quantum number, 
$C$ must be $+1$ for the Higgs boson. The evidence therefore supports the SM assignment 
$J^{PC} = 0^{++}$ for the Higgs boson. 

The excellent performance of the LHC and of the ATLAS and CMS detectors, together 
with corresponding theoretical efforts, have confirmed the compatibility of the resonance 
state at 125.25 GeV with the Higgs boson of the SM. There is as yet no evidence that this 
state is a composite object, or that its couplings deviate from the SM values. It remains to 
be seen whether future data of greater precision will alter these conclusions. 


1.5 Summary 


The SM provides a relatively simple picture of quarks and leptons and their non- 
gravitational interactions. The quark colour triplets are the basic source particles of the 
gluon fields in QCD, and they bind together to make hadrons. The weak interactions involve 
quark and lepton doublets—for instance the quark doublet (u,d) and the lepton doublet 
($\nu_e$, $e^-$) of the first generation. These are sources for the $W^\pm$ and $Z^0$ fields. Charged fermions 
(quarks and leptons) are sources for the photon field. All the mediating force quanta have 
spin-1. The weak and strong force fields are generalizations of electromagnetism; all three 
are examples of gauge theories, but realized in subtly different ways. 

In the following chapters our aim will be to lead the reader through the mathematical 
formalism involved in giving precise quantitative form to what we have so far described only 
qualitatively and to provide physical interpretation where appropriate. In the remainder of 
part 1 of the present volume, we first show how Schrödinger’s quantum mechanics and 
Maxwell’s electromagnetic theory may be combined as a gauge theory—in fact the simplest 
example of such a theory. We then introduce relativistic quantum mechanics for spin-0 
and spin-$\frac{1}{2}$ particles, and include electromagnetism via the gauge principle. In part 2, we 
develop the formalism of quantum field theory, beginning with scalar fields and moving on 
to QED. This is then applied to many simple (‘tree level’) QED processes in part 3. In the 
final part 4, we present an introduction to renormalization at the one-loop level, including 
renormalization of QED. The more complicated gauge theories of QCD and the electroweak 
theory are reserved for volume 2. 


Problems 


1.1 Evaluate the integral in (1.26) directly. [Hint: Use spherical polar coordinates with the 
polar axis along the direction of $\boldsymbol{q}$, so that $d^3\boldsymbol{r} = r^2 dr \sin\theta\, d\theta\, d\phi$, and 
$\exp(i\boldsymbol{q}\cdot\boldsymbol{r}) = \exp(i|\boldsymbol{q}|r\cos\theta)$. Make the change of variable $x = \cos\theta$, and do the $\phi$ integral 
(trivial) and the $x$ integral. Finally do the $r$ integral.] 


1.2 Using the concept of strangeness conservation in strong interactions, explain why the 
threshold energy (for $\pi^-$ incident on stationary protons) for 


$\pi^- + p \to K^0 +$ anything 


is less than for 


$\pi^- + p \to \bar{K}^0 +$ anything 
assuming both processes proceed through the strong interaction. 


1.3 Note: the invariant square $p^2$ of a 4-momentum $p = (E, \boldsymbol{p})$ is defined as $p^2 = E^2 - \boldsymbol{p}^2$. 
We remind the reader that $\hbar = c = 1$ (see Appendix B). 


(i) An electron of 4-momentum k scatters from a stationary proton of mass M via 
a one-photon exchange process, producing a final hadronic state of 4-momentum 
p', the final electron 4-momentum being k’. Show that 


$p'^2 = q^2 + 2M(E - E') + M^2$ 


where $q^2 = (k - k')^2$, and $E$ and $E'$ are the initial and final electron energies, 
respectively, in this frame (i.e. the one in which the target proton is at rest). Show 
that if the electrons are highly relativistic then $q^2 = -4EE' \sin^2\theta/2$, where $\theta$ is 
the scattering angle in this frame. Deduce that for elastic scattering $E'$ and $\theta$ are 
related by 

$E' = E \Big/ \left(1 + \frac{2E}{M}\sin^2\theta/2\right)$ 


(ii) Electrons of energy 4.879 GeV scatter elastically from protons, with $\theta = 10^\circ$. 
What is the observed value of E’? 


(iii) In the scattering of these electrons, at 10°, it is found that there is a peak of 
events at E’ = 4.2 GeV. What is the invariant mass of the produced hadronic 
state (in MeV)? 


(iv) Calculate the value of E’ at which the ‘quasi-elastic peak’ will be observed, when 
electrons of energy 400 MeV scatter at an angle $\theta = 45^\circ$ from a He nucleus, assuming 
that the struck nucleon is at rest inside the nucleus. Estimate the broadening 
of this final peak caused by the fact that the struck nucleon has, in fact, a 
momentum distribution by virtue of being localized within the nuclear size. 
1.4 
(i) In a simple non-relativistic model of a hydrogen-like atom, the energy levels are 
given by 

$E_n = -\frac{\alpha^2 Z^2 \mu}{2n^2}$ 

where $Z$ is the nuclear charge and $\mu$ is the reduced mass of the electron and 
nucleus. Calculate the splitting in eV between the $n = 1$ and $n = 2$ states in 
positronium, which is an $e^+e^-$ bound state, assuming this model holds. 


(ii) In this model, the $e^+e^-$ potential is the simple Coulomb one 

$-\frac{e^2}{4\pi\epsilon_0 r} = -\frac{\alpha}{r}.$ 

Suppose that the potential between a heavy quark $Q$ and an anti-quark $\bar{Q}$ was 

$-\frac{\alpha_s}{r}$ 

where $\alpha_s$ is a ‘strong fine structure constant’. Calculate values of $\alpha_s$ (different in 
(a) and (b)) corresponding to the information (the quark masses are phenomenological 
“quark model” masses): 

(a) the splitting between the $n = 2$ and $n = 1$ states in charmonium ($c\bar{c}$) is 
588 MeV, and $m_c = 1870$ MeV; 

(b) the splitting between the $n = 2$ and $n = 1$ states in the upsilon series ($b\bar{b}$) is 
563 MeV and $m_b = 5280$ MeV. 


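(iii) In positronium, the $n = 1$ $^3S_1$ and $n = 1$ $^1S_0$ states are split by the hyperfine 
interaction, which has the form $\frac{7}{48}\alpha^4 m_e\, \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$, where $m_e$ is the electron mass and 
$\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ are the spin matrices for the $e^-$ and $e^+$, respectively. Calculate the 
expectation value of $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ in the $^3S_1$ and $^1S_0$ states, and hence evaluate the 
splitting between these levels (calculated in lowest order perturbation theory) in 
eV. [Hint: the total spin $\boldsymbol{S}$ is given by $\boldsymbol{S} = \frac{1}{2}(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)$. So 
$\boldsymbol{S}^2 = \frac{1}{4}(\boldsymbol{\sigma}_1^2 + \boldsymbol{\sigma}_2^2 + 2\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2)$. Hence the eigenvalues of $\boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2$ are directly related to those of $\boldsymbol{S}^2$.] 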


(iv) Suppose an analogous ‘strong’ hyperfine interaction existed in the $c\bar{c}$ system, and 
was responsible for the splitting between the $n = 1$ $^3S_1$ and $n = 1$ $^1S_0$ states, which 
is 116 MeV experimentally (i.e. replace $\alpha$ by $\alpha_s$ and $m_e$ by $m_c = 1870$ MeV). 
Calculate the corresponding value of $\alpha_s$. 


1.5 The potential between a heavy quark $Q$ and an anti-quark $\bar{Q}$ is found empirically to be 
well represented by 

$V(r) = -\frac{\alpha_s}{r} + br$ 

where $\alpha_s \approx 0.5$ and $b \approx 0.18$ GeV$^2$. Indicate the origin of the first term in $V(r)$, and the 
significance of the second. 

An estimate of the ground-state energy of the bound QQ system may be made as follows. 
For a given r, the total energy is 


2 
Elr) = 2m- & +br + 
r m 


where m is the mass of the Q (or Q) and p is its momentum (assumed non-relativistic). 
Explain why p may be roughly approximated by 1/r, and sketch the resulting E(r) as a 
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function of r. Hence show that, in this approximation, the radius of the ground state, ro, is 
given by the solution of 


2 Q 
a T 
mrp ró 


Taking m = 1.5 GeV as appropriate to the ct system, verify that for this system 
(1/ro) ~ 0.67 GeV 


and calculate the energy of the ct ground state in GeV, according to this model. 
An excited ct state at 3.686 GeV has a total width of 278 keV, and one at 3.77 GeV 
has a total width of 24 MeV. Comment on the values of these widths. 


1.6 The Hamiltonian for a two-state system using the normalized base states $|1\rangle$, $|2\rangle$ has 
the form 

$$\begin{pmatrix} \langle 1|H|1\rangle & \langle 1|H|2\rangle \\ \langle 2|H|1\rangle & \langle 2|H|2\rangle \end{pmatrix} = \begin{pmatrix} -a\cos 2\theta & a\sin 2\theta \\ a\sin 2\theta & a\cos 2\theta \end{pmatrix}$$

where a is real and positive. Find the energy eigenvalues $E_+$ and $E_-$, and express the 
corresponding normalized eigenstates $|+\rangle$ and $|-\rangle$ in terms of $|1\rangle$ and $|2\rangle$. 

At time t = 0 the system is in state $|1\rangle$. Show that the probability that it will be found 
to be in state $|2\rangle$ at a later time t is 

$$\sin^2 2\theta\, \sin^2(at).$$


Discuss how a formalism of this kind can be used in the context of neutrino oscillations. 
How might the existence of neutrino oscillations explain the solar neutrino problem? (This 
will be discussed in chapter 21 of volume 2.) 


1.7 In an interesting speculation, it has been suggested (Arkani-Hamed et al. 1998, 1999, 
Antoniadis et al. 1998) that the weakness of gravity as observed in our (apparently) three- 
dimensional world could be due to the fact that gravity actually extends into additional 
‘compactified’ dimensions (that is, dimensions which have the geometry of a circle, rather 
than of an infinite line). For the particles and forces of the Standard Model, however, such 
leakage into extra dimensions has to be confined to currently probed distances, which are 
of order $M_W^{-1}$. 


(i) Consider Newtonian gravity in (3+d) spatial dimensions. Explain why you would 
expect that the gravitational potential will have the form 

$$V_{N,3+d}(r) = \frac{m_1 m_2 G_{N,3+d}}{r^{d+1}}. \qquad (1.34)$$

[Think about how the ‘$1/r^2$’ fall-off of the force is related to the surface area of a 
sphere in the case d = 0. Note that the formula works for d = −2! What happens 
in the case d = −1?] 


(ii) Show that $G_{N,3+d}$ has dimensions (mass)$^{-(2+d)}$. This allows us to introduce 
the ‘true’ Planck scale (i.e. the one for the underlying theory in 3 + d spatial 
dimensions) as $G_{N,3+d} = (M_{P,3+d})^{-(2+d)}$. 


(iii) Now suppose that the form (1.34) only holds when the distance r between the 
masses is much smaller than R, the size of the compactified dimensions. If the masses 
are placed at distances $r \gg R$, their gravitational flux cannot continue to penetrate 
into the extra dimensions, and the potential (1.34) should reduce to the 
familiar three-dimensional one, so we must have 

$$V_{N,3+d}(r \gg R) = \frac{m_1 m_2 G_{N,3+d}}{R^d}\,\frac{1}{r}. \qquad (1.35)$$

Show that this implies that 

$$M_P^2 = M_{P,3+d}^2\,(R M_{P,3+d})^d. \qquad (1.36)$$

Suppose that d = 2 and R ~ 1 mm. What would $M_{P,3+d}$ be, in TeV? Suggest ways 
in which this theory might be tested experimentally. Taking $M_{P,3+d}$ ~ 1 TeV, 
explore other possibilities for d and R. 


Electromagnetism as a Gauge Theory 

2.1 Introduction 


The previous chapter introduced the basic ideas of the Standard Model (SM) of parti- 
cle physics, in which quarks and leptons interact via the exchange of gauge field quanta. 
We must now look more closely into what is the main concern of this book—namely, the 
particular nature of these gauge theories. 

One of the relevant forces—electromagnetism—has been well understood in its classical 
guise for many years. Over a century ago, Faraday, Maxwell, and others developed the theory 
of electromagnetic interactions culminating in Maxwell’s paper of 1864 (Maxwell 1864). 
Today Maxwell’s theory still stands—unlike Newton’s ‘classical mechanics’ which was shown 
by Einstein to require modifications at relativistic speeds, approaching the speed of light. 
Moreover, Maxwell’s electromagnetism, when suitably married with quantum mechanics, 
gives us quantum electrodynamics or QED. We shall see in chapter 10 that this theory is in 
truly remarkable agreement with experiment. As we have already indicated, the theories of 
the weak and strong forces included in the SM are generalizations of QED, and promise to be 
as successful as that theory. The simplest of the three, QED, is therefore our paradigmatic 
theory. 

From today’s perspective, the crucial thing about electromagnetism is that it is a theory 
in which the dynamics (i.e. the behaviour of the forces) is intimately related to a symmetry 
principle. In the everyday world, a symmetry operation is something that can be done to an 
object that leaves the object looking the same after the operation as before. By extension, 
we may consider mathematical operations—or ‘transformations’—applied to the objects in 
our theory such that the physical laws look the same after the operations as they did before. 
Such transformations are usually called invariances of the laws. Familiar examples are, for 
instance, the translation and rotation invariance of all fundamental laws: Newton’s laws of 
motion remain valid whether or not we translate or rotate a system of interacting particles. 
But of course—precisely because they do apply to all laws, classical or quantum—these two 
invariances have no special connection with any particular force law. Instead, they constrain 
the form of the allowed laws to a considerable extent, but by no means uniquely determine 
them. Nevertheless, this line of argument leads one to speculate whether it might in fact be 
possible to impose further types of symmetry constraints so that the forms of the force laws 
are essentially determined. This would then be one possible answer to the question: why 
are the force laws the way they are? (Ultimately of course this only replaces one question 
by another!) 

In this chapter, we shall discuss electromagnetism from this point of view. This is not 
the historical route to the theory, but it is the one which generalizes to the other two 
interactions. This is why we believe it important to present the central ideas of this approach 
in the familiar context of electromagnetism at this early stage. 

A distinction that is vital to the understanding of all these interactions is that between 
a global invariance and a local invariance. In a global invariance, the same transformation 
is carried out at all space-time points: it has an ‘everywhere simultaneously’ character. In 
a local invariance, different transformations are carried out at different individual space-time 
points. In general, as we shall see, a theory that is globally invariant will not be 
invariant under locally varying transformations. However, by introducing new force fields 
that interact with the original particles in the theory in a specific way, and which also 
transform in a particular way under the local transformations, a sort of local invariance can 
be restored. We will see all these things more clearly when we go into more detail, but the 
important conceptual point to be grasped is this: one may view these special force fields 
and their interactions as existing in order to permit certain local invariances to be true. The 
particular local invariance relevant to electromagnetism is the well-known gauge invariance 
of Maxwell’s equations: in the quantum form of the theory this property is directly related 
to an invariance under local phase transformations of the quantum fields. A generalized form 
of this phase invariance also underlies the theories of the weak and strong interactions. For 
this reason, they are all known as ‘gauge theories’. 

A full understanding of gauge invariance in electrodynamics can only be reached via the 
formalism of quantum field theory, which is not easy to master—and the theory of quantum 
gauge fields is particularly tricky, as we shall see in chapter 7. Nevertheless, many of the 
crucial ideas can be perfectly adequately discussed within the more familiar framework of 
ordinary quantum mechanics, rather than quantum field theory, treating electromagnetism 
as a purely classical field. This is the programme followed in the rest of part 1 of this volume. 
In the present chapter, we shall discuss these ideas in the context of non-relativistic quantum 
mechanics. In the following two chapters, we shall explore the generalization to relativistic 
quantum mechanics, for particles of spin-0 (via the Klein-Gordon equation) and spin-½ (via 
the Dirac equation). While containing substantial physics in their own right, these chapters 
constitute essential groundwork for the quantum field treatment in parts 2-4. 


2.2 The Maxwell equations: current conservation 


Question: Would you distinguish local conservation laws from global conservation laws? 
Feynman: If a cat were to disappear in Pasadena and at the same time appear in Erice, 
that would be an example of global conservation of cats. This is not the way cats are 
conserved. Cats or charge or baryons are conserved in a much more continuous way. If 
any of these quantities begin to disappear in a region, then they begin to appear in a 
neighbouring region. Consequently, we can identify the flow of charge out of a region with 
the disappearance of charge inside the region. This identification of the divergence of a 
flux with the time rate of change of a charge density is called a local conservation law. A 
local conservation law implies that the total charge is conserved globally, but the reverse 
does not hold. However, relativistically it is clear that non-local global conservation laws 
cannot exist, since to a moving observer the cat will appear in Erice before it disappears 
in Pasadena. 


[From the question-and-answer session following a lecture by R.P.Feynman at the 1964 
International School of Physics “Ettore Majorana” (Feynman 1965b)]. 

We begin by considering the basic laws of classical electromagnetism, the Maxwell equations. 
We use a system of units (Heaviside-Lorentz) which is convenient in particle physics 
(see appendix C). Before Maxwell’s work, these laws were 

$$\nabla \cdot \mathbf{E} = \rho_{\rm em} \qquad \text{(Gauss’ law)} \qquad (2.1)$$
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \qquad \text{(Faraday-Lenz laws)} \qquad (2.2)$$
$$\nabla \cdot \mathbf{B} = 0 \qquad \text{(no magnetic charges)} \qquad (2.3)$$

and, for steady currents, 

$$\nabla \times \mathbf{B} = \mathbf{j}_{\rm em} \qquad \text{(Ampère’s law)}. \qquad (2.4)$$


Here $\rho_{\rm em}$ is the charge density and $\mathbf{j}_{\rm em}$ is the current density; these densities act as ‘sources’ 
for the E and B fields. Maxwell noticed that taking the divergence of this last equation 
leads to conflict with the continuity equation for electric charge 

$$\frac{\partial \rho_{\rm em}}{\partial t} + \nabla \cdot \mathbf{j}_{\rm em} = 0. \qquad (2.5)$$

Since 

$$\nabla \cdot (\nabla \times \mathbf{B}) = 0 \qquad (2.6)$$

from (2.4) there follows the result 

$$\nabla \cdot \mathbf{j}_{\rm em} = 0. \qquad (2.7)$$


This can only be true in situations where the charge density is constant in time. For the 
general case, Maxwell modified Ampère’s law to read 


$$\nabla \times \mathbf{B} = \mathbf{j}_{\rm em} + \frac{\partial \mathbf{E}}{\partial t} \qquad (2.8)$$

which is now consistent with (2.5). Equations (2.1)-(2.3), together with (2.8), constitute 
Maxwell’s equations in free space (apart from the sources). 

It is worth spending a moment on the vitally important continuity equation (2.5); note 
the Feynman quotation at the start of this section. Let us integrate this equation over any 
arbitrary volume $\Omega$, and write the result as 

$$\frac{\partial}{\partial t} \int_\Omega \rho_{\rm em}\, \mathrm{d}V = -\int_\Omega \nabla \cdot \mathbf{j}_{\rm em}\, \mathrm{d}V. \qquad (2.9)$$

Equation (2.9) states that the rate of decrease of charge in any arbitrary volume $\Omega$ is 
due precisely and only to the flux of current out of its surface (by the divergence theorem, 
the right-hand side is minus that outward flux); that is, no net charge can 
be created or destroyed in $\Omega$. Since $\Omega$ can be made as small as we please, this means 
that electric charge must be locally conserved: a process in which charge is created at one 
point and destroyed at a distant one is not allowed, despite the fact that it conserves 
the charge overall or ‘globally’. The ultimate reason for this is that the global form of 
charge conservation would necessitate the instantaneous propagation of signals (such as 
‘now, create a positron over there’), and this conflicts with special relativity, a theory 
which, historically, flowered from the soil of electrodynamics. The extra term introduced by 
Maxwell (the ‘electric displacement current’) owes its place in the dynamical equations to 
a local conservation requirement. 

We remark at this point that we have just introduced another local/global distinction, 
similar to that discussed earlier in connection with invariances. In this case the distinction 
applies to a conservation law, but since invariances are related to conservation laws in both 
classical and quantum mechanics, we should perhaps not be too surprised by this. However, 
as with invariances, conservation laws (such as charge conservation in electromagnetism) 
play a central role in gauge theories in that they are closely related to the dynamics. The 
point is simply illustrated by asking how we could measure the charge of a newly created 
subatomic particle X. There are two conceptually different ways: 


(i) We could arrange for X to be created in a reaction such as 

$$A + B \to C + D + X$$


where the charges of A, B, C, and D are already known. In this case we can use 
charge conservation to determine the charge of X. 


(ii) We could see how particle X responded to known electromagnetic fields. This uses 
dynamics to determine the charge of X. 


Either way gives the same answer: it is the conserved charge which determines the 
particle’s response to the field. By contrast, there are several other conservation laws that 
seem to hold in particle physics, such as lepton number and baryon number, that apparently 
have no dynamical counterpart (cf. the remarks at the end of section 1.3.6). To determine 
the baryon number of a newly produced particle, we have to use B conservation and tot 
up the total baryon number on either side of the reaction. As far as we know there is no 
baryonic force field. 

Thus gauge theories are characterized by a close interrelation between three conceptual 
elements: symmetries, conservation laws, and dynamics. In fact, it is now widely believed 
that the only exact quantum number conservation laws are those which have an associated 
gauge theory force field—see comment (i) in section 2.6. Thus one might suspect that 
baryon number is not absolutely conserved—as is indeed the case in proposed unified gauge 
theories of the strong, weak, and electromagnetic interactions. In this discussion we have 
briefly touched on the connection between two pairs of these three elements: symmetries ↔ 
dynamics; and conservation laws ↔ dynamics. The precise way in which the remaining link 
is made—between the symmetry of electromagnetic gauge invariance and the conservation 
law of charge—is more technical. We will discuss this connection with the help of simple 
ideas from quantum field theory in chapter 7, section 7.4. For the present, we continue with 
our study of the Maxwell equations and, in particular, of the gauge invariance they exhibit. 
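
The difference between local and global conservation discussed in this section can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: it works in one space dimension, and the two charge distributions (`rho_drift`, `rho_teleport`), the drift speed `V_DRIFT` and all helper names are hypothetical choices that do not appear in the text. It evaluates the one-dimensional analogue of (2.9), namely dQ/dt + [j(b) − j(a)] for a region [a, b], which should vanish if charge is conserved locally in that region.

```python
import math

def integrate(f, a, b, n=2000):
    """Trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def residual(rho, j, a, b, t, dt=1e-4):
    """dQ/dt + [j(b) - j(a)] for the region [a, b]; zero if (2.9) holds there."""
    q_later = integrate(lambda x: rho(x, t + dt), a, b)
    q_earlier = integrate(lambda x: rho(x, t - dt), a, b)
    return (q_later - q_earlier) / (2 * dt) + j(b, t) - j(a, t)

V_DRIFT = 0.5  # drift speed of the packet (hypothetical value)

def rho_drift(x, t):
    """A Gaussian packet of charge moving rigidly at speed V_DRIFT."""
    return math.exp(-(x - V_DRIFT * t) ** 2)

def j_drift(x, t):
    """Current carried by the drifting packet, j = rho * v."""
    return V_DRIFT * rho_drift(x, t)

def rho_teleport(x, t):
    """Charge fading out near x = 0 while fading in near x = 10."""
    return (1 - t) * math.exp(-x ** 2) + t * math.exp(-(x - 10) ** 2)

def j_teleport(x, t):
    """No current flows between the two places."""
    return 0.0

drift_local = residual(rho_drift, j_drift, 0.2, 1.0, 0.3)
teleport_local = residual(rho_teleport, j_teleport, -3.0, 3.0, 0.3)
teleport_global = residual(rho_teleport, j_teleport, -10.0, 20.0, 0.3)
```

For the drifting packet the residual vanishes (to numerical accuracy) for any region. For the ‘teleporting’ charge it vanishes only when the region is large enough to contain both the disappearance and the appearance, which is the situation of Feynman’s cat: the total charge is conserved, but not locally.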


2.3 The Maxwell equations: Lorentz covariance and gauge 
invariance 


In classical electromagnetism, and especially in quantum mechanics, it is convenient to 
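introduce the vector potential $A^\mu(x)$ in place of the fields E and B. We write: 

$$\mathbf{B} = \nabla \times \mathbf{A} \qquad (2.10)$$
$$\mathbf{E} = -\nabla V - \frac{\partial \mathbf{A}}{\partial t} \qquad (2.11)$$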


which defines the 3-vector potential A and the scalar potential V. With these definitions, 
equations (2.2) and (2.3) are then automatically satisfied. 

The origin of gauge invariance in classical electromagnetism lies in the fact that the 
potentials A and V are not unique for given physical fields E and B. The transformations 
that A and V may undergo while preserving E and B (and hence the Maxwell equations) 
unchanged are called gauge transformations, and the associated invariance of the Maxwell 
equations is called gauge invariance. 

What are these transformations? Clearly A can be changed by 


$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi \qquad (2.12)$$

where $\chi$ is an arbitrary function, with no change in B since $\nabla \times \nabla f = 0$, for any scalar 
function f. To preserve E, V must then change simultaneously by 

$$V \to V' = V - \frac{\partial \chi}{\partial t}. \qquad (2.13)$$
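
As a rough numerical check of (2.12) and (2.13), the following Python sketch computes E and B from (2.10) and (2.11) by central differences, before and after a gauge transformation. It is an illustration only: the particular potentials `V` and `A`, the gauge function `chi`, the step `H` and the sample point are hypothetical choices made for the test, not quantities from the text.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def partial(f, p, i):
    """Central-difference derivative of f at p = (t, x, y, z) along coordinate i."""
    up, down = list(p), list(p)
    up[i] += H
    down[i] -= H
    return (f(tuple(up)) - f(tuple(down))) / (2 * H)

# Hypothetical smooth potentials and gauge function, chosen only for this check.
def V(p):
    t, x, y, z = p
    return x * y * math.cos(t) + z ** 2

def A(p):
    t, x, y, z = p
    return (math.sin(y * t), x * z + t, math.exp(-x) * y)

def chi(p):
    t, x, y, z = p
    return math.sin(x + 2 * y) * math.exp(-z) * t ** 2

def fields(scalar, vector, p):
    """E = -grad V - dA/dt and B = curl A at the point p, as in (2.10) and (2.11)."""
    def comp(k):
        return lambda q: vector(q)[k]
    E = tuple(-partial(scalar, p, k + 1) - partial(comp(k), p, 0) for k in range(3))
    B = (
        partial(comp(2), p, 2) - partial(comp(1), p, 3),
        partial(comp(0), p, 3) - partial(comp(2), p, 1),
        partial(comp(1), p, 1) - partial(comp(0), p, 2),
    )
    return E, B

def A_prime(p):
    """A' = A + grad(chi), equation (2.12)."""
    return tuple(A(p)[k] + partial(chi, p, k + 1) for k in range(3))

def V_prime(p):
    """V' = V - d(chi)/dt, equation (2.13)."""
    return V(p) - partial(chi, p, 0)

point = (0.7, 0.3, -0.4, 0.5)  # (t, x, y, z)
E, B = fields(V, A, point)
E_new, B_new = fields(V_prime, A_prime, point)
potential_shift = abs(V_prime(point) - V(point))
field_shift = max(abs(u - w) for u, w in zip(E + B, E_new + B_new))
```

Here `potential_shift` is of order one while `field_shift` is at the level of rounding error: the potentials change appreciably, the physical fields do not.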
These transformations can be combined into a single compact equation by introducing the 
4-vector potential¹: 

$$A^\mu = (V, \mathbf{A}) \qquad (2.14)$$

and noting (from problem 2.1) that the differential operators $(\partial/\partial t, -\nabla)$ form the components 
of a 4-vector operator $\partial^\mu$. A gauge transformation is then specified by 

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \qquad (2.15)$$


The Maxwell equations can also be written in a manifestly Lorentz covariant form (see 
appendix D) using the 4-current $j^\mu_{\rm em}$ given by 

$$j^\mu_{\rm em} = (\rho_{\rm em}, \mathbf{j}_{\rm em}) \qquad (2.16)$$

in terms of which the continuity equation takes the form (problem 2.1): 

$$\partial_\mu j^\mu_{\rm em} = 0. \qquad (2.17)$$

The Maxwell equations (2.1) and (2.8) then become (problem 2.2): 

$$\partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \qquad (2.18)$$

where we have defined the field strength tensor: 

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu. \qquad (2.19)$$

Under the gauge transformation 

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi \qquad (2.20)$$

$F^{\mu\nu}$ remains unchanged: 

$$F^{\mu\nu} \to F'^{\mu\nu} = F^{\mu\nu} \qquad (2.21)$$

so $F^{\mu\nu}$ is gauge invariant and so, therefore, are the Maxwell equations in the form (2.18). 
The ‘Lorentz-covariant and gauge-invariant field equations’ satisfied by $A^\mu$ then follow from 
equations (2.18) and (2.19): 

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \qquad (2.22)$$


Since gauge transformations turn out to be of central importance in the quantum theory 
of electromagnetism, it would be nice to have some insight into why Maxwell’s equations 
are gauge invariant. The all-important ‘fourth’ equation (2.8) was inferred by Maxwell from 
local charge conservation, as expressed by the continuity equation 

$$\partial_\mu j^\mu_{\rm em} = 0. \qquad (2.23)$$

The field equation 

$$\partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \qquad (2.24)$$

then of course automatically embodies (2.23). The mathematical reason it does so is that 
$F^{\mu\nu}$ is a four-dimensional kind of ‘curl’ 

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \qquad (2.25)$$

which (as we have seen in (2.21)) is unchanged by a gauge transformation 

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \qquad (2.26)$$

¹ See appendix D for relativistic notation and for an explanation of the very important concept of 
covariance, which we are about to invoke in the context of Lorentz transformations, and will use again in the 
next section in the context of gauge transformations; we shall also use it in other contexts in later chapters. 


Hence there is the suggestion that the gauge invariance is related in some way to 
charge conservation. However, the connection is not so simple. Wigner (1949) has given 
a simple argument to show that the principle that no physical quantity can depend on 
the absolute value of the electrostatic potential, when combined with energy conservation, 
implies the conservation of charge. Wigner’s argument relates charge (and energy) conser- 
vation to an invariance under transformation of the electrostatic potential by a constant; 
charge conservation alone does not seem to require the more general space-time-dependent 
transformation of gauge invariance. 

Changing the value of the electrostatic potential by a constant amount is an example of 
what we have called a global transformation (since the change in the potential is the same 
everywhere). Invariance under this global transformation is related to a conservation law, 
that of charge. But this global invariance is not sufficient to generate the full Maxwellian 
dynamics. However, as remarked by ’t Hooft (1980), one can regard equations (2.12) and 
(2.13) as expressing the fact that the local change in the electrostatic potential V (the 
$\partial\chi/\partial t$ term in (2.13)) can be compensated (in the sense of leaving the Maxwell equations 
unchanged) by a corresponding local change in the magnetic vector potential A. Thus by 
including magnetic effects, the global invariance under a change of V by a constant can be 
extended to a local invariance (which is a much more restrictive condition to satisfy). Hence 
there is a beginning of a suggestion that one might almost ‘derive’ the complete Maxwell 
equations, which unify electricity and magnetism, from the requirement that the theory 
be expressed in terms of potentials in such a way as to be invariant under local (gauge) 
transformations on those potentials. Certainly special relativity must play a role too and 
this also links electricity and magnetism, via the magnetic effects of charges as seen by an 
observer moving relative to them. If a 4-vector potential $A^\mu$ is postulated, and it is then 
demanded that the theory involve it only in a way which is insensitive to local changes of 
the form (2.15), one is led naturally to the idea that the physical fields enter only via the 
quantity $F^{\mu\nu}$, which is invariant under (2.15). From this, one might conjecture the field 
equation on grounds of Lorentz covariance. 

It goes without saying that this is certainly not a ‘proof’ or ‘derivation’ of the Maxwell 
equations. Nevertheless, the idea that dynamics (in this case, the complete interconnection 
of electric and magnetic effects) may be intimately related to a local invariance requirement 
(in this case, electromagnetic gauge invariance) turns out to be a fruitful one. As indicated 
in section 2.1, it is generally the case that, when a certain global invariance is generalized to 
a local one, the existence of a new ‘compensating’ field is entailed, interacting in a specified 
way. The first example of a dynamical theory ‘derived’ from a local invariance requirement 
seems to be the theory of Yang and Mills (1954) (see also Shaw 1955). Their work was extended 
by Utiyama (1956), who developed a general formalism for such compensating fields. 
As we have said, these types of dynamical theories, based on local invariance principles, are 
called gauge theories. 

It is a remarkable fact that the interactions in the SM of particle physics are of precisely 
this type. We have briefly discussed the Maxwell equations in this light, and we will continue 
with (quantum) electrodynamics in the following two sections. The two other fundamental 
interactions (the strong interaction between quarks and the weak interaction between 
quarks and leptons) also seem to be described by gauge theories (of essentially the Yang-Mills 
type), as we shall see in detail in the second volume of this book. A fourth example, 
but one which we shall not pursue in this book, is that of general relativity (the theory 
of gravitational interactions). Utiyama (1956) showed that this theory could be arrived at 
by generalizing the global (space-time independent) coordinate transformations of special 
relativity to local ones; as with electromagnetism, the more restrictive local invariance re- 
quirements entailed the existence of a new field—the gravitational one—with an (almost) 
prescribed form of interaction. Unfortunately, despite this ‘gauge’ property, no consistent 
quantum field theory of general relativity is known. 

In order to proceed further, we must now discuss how such (gauge) ideas are incorporated 
into quantum mechanics. 


2.4 Gauge invariance (and covariance) in quantum mechanics 


The Lorentz force law for a non-relativistic particle of charge q moving with velocity v 
under the influence of both electric and magnetic fields is 


$$\mathbf{F} = q\mathbf{E} + q\mathbf{v} \times \mathbf{B}. \qquad (2.27)$$

It may be derived, via Hamilton’s equations, from the classical Hamiltonian² 

$$H = \frac{1}{2m}(\mathbf{p} - q\mathbf{A})^2 + qV. \qquad (2.28)$$


The Schrödinger equation for such a particle in an electromagnetic field is 

$$\left(\frac{1}{2m}(-i\nabla - q\mathbf{A})^2 + qV\right)\psi(\mathbf{x}, t) = i\frac{\partial \psi(\mathbf{x}, t)}{\partial t} \qquad (2.29)$$

which is obtained from the classical Hamiltonian by the usual prescription, $\mathbf{p} \to -i\nabla$, for 
Schrödinger’s wave mechanics ($\hbar = 1$). Note the appearance of the operator combinations 

$$\mathbf{D} \equiv \nabla - iq\mathbf{A}, \qquad D^0 \equiv \partial/\partial t + iqV \qquad (2.30)$$

in place of $\nabla$ and $\partial/\partial t$, in going from the free-particle Schrödinger equation to the 
electromagnetic field case. 

The solution $\psi(\mathbf{x}, t)$ of the Schrödinger equation (2.29) describes completely the state 
of the particle moving under the influence of the potentials V, A. However, these potentials 
are not unique, as we have already seen: they can be changed by a gauge transformation 

$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi \qquad (2.31)$$
$$V \to V' = V - \partial\chi/\partial t \qquad (2.32)$$

and the Maxwell equations for the fields E and B will remain the same. This immediately 
raises a serious question: if we carry out such a change of potentials in equation (2.29), 
will the solution of the resulting equation describe the same physics as the solution of 
equation (2.29)? If it does, we shall be able to assume the validity of Maxwell’s theory for 
the quantum world; if not, some modification will be necessary, since the gauge symmetry 
possessed by the Maxwell equations will be violated in the quantum theory. 

² We set $\hbar = c = 1$ throughout (see appendix B). 

The answer to the question just posed is evidently negative, since it is clear that the same 
‘$\psi$’ cannot possibly satisfy both (2.29) and the analogous equation with (V, A) replaced by 
($V'$, $\mathbf{A}'$). Unlike Maxwell’s equations, the Schrödinger equation is not gauge invariant. But 
we must remember that the wavefunction $\psi$ is not a directly observable quantity, as the 
electromagnetic fields E and B are. Perhaps $\psi$ does not need to remain unchanged (invariant) 
when the potentials are changed by a gauge transformation. In fact, in order to have 
any chance of ‘describing the same physics’ in terms of the gauge-transformed potentials, 
we will have to allow $\psi$ to change as well. This is a crucial point: for quantum mechanics to 
be consistent with Maxwell’s equations, it is necessary for the gauge transformations (2.31) 
and (2.32) of the Maxwell potentials to be accompanied also by a transformation of the 
quantum-mechanical wavefunction, $\psi \to \psi'$, where $\psi'$ satisfies the equation 

$$\left(\frac{1}{2m}(-i\nabla - q\mathbf{A}')^2 + qV'\right)\psi'(\mathbf{x}, t) = i\frac{\partial \psi'(\mathbf{x}, t)}{\partial t}. \qquad (2.33)$$

Note that the form of (2.33) is exactly the same as the form of (2.29); it is this that 
will effectively ensure that both ‘describe the same physics’. Readers of appendix D will 
expect to be told that, if we can find such a $\psi'$, we may then assert that (2.29) is gauge 
covariant, meaning that it maintains the same form under a gauge transformation. (The 
transformations relevant to this use of ‘covariance’ are gauge transformations.) 

Since we know the relations (2.31) and (2.32) between A, V and $\mathbf{A}'$, $V'$, we can actually 
find what $\psi'(\mathbf{x}, t)$ must be in order that equation (2.33) be consistent with (2.29). We shall 
state the answer and then verify it; then we shall discuss the physical interpretation. The 
required $\psi'(\mathbf{x}, t)$ is 

$$\psi'(\mathbf{x}, t) = \exp[iq\chi(\mathbf{x}, t)]\,\psi(\mathbf{x}, t) \qquad (2.34)$$

where $\chi$ is the same space-time-dependent function as appears in equations (2.31) and 
(2.32). To verify this we consider 

$$\begin{aligned}
(-i\nabla - q\mathbf{A}')\psi' &= [-i\nabla - q\mathbf{A} - q(\nabla\chi)][\exp(iq\chi)\psi] \\
&= q(\nabla\chi)\exp(iq\chi)\psi + \exp(iq\chi)\cdot(-i\nabla\psi) \\
&\quad + \exp(iq\chi)\cdot(-q\mathbf{A}\psi) - q(\nabla\chi)\exp(iq\chi)\psi.
\end{aligned} \qquad (2.35)$$

The first and the last terms cancel leaving the result: 

$$(-i\nabla - q\mathbf{A}')\psi' = \exp(iq\chi)\cdot(-i\nabla - q\mathbf{A})\psi \qquad (2.36)$$

which may be written using equation (2.30) as: 

$$(-i\mathbf{D}'\psi') = \exp(iq\chi)\cdot(-i\mathbf{D}\psi). \qquad (2.37)$$

Thus, although the space-time-dependent phase factor feels the action of the gradient operator 
$\nabla$, it ‘passes through’ the combined operator $\mathbf{D}'$ and converts it into $\mathbf{D}$. In fact, by 
comparing the equations (2.34) and (2.37), we see that $\mathbf{D}'\psi'$ bears to $\mathbf{D}\psi$ exactly the same 
relation as $\psi'$ bears to $\psi$. In just the same way, we find (cf. equation (2.30)) 

$$(iD^{0\prime}\psi') = \exp(iq\chi)\cdot(iD^0\psi) \qquad (2.38)$$

where we have used equation (2.32) for $V'$. Once again, $D^{0\prime}\psi'$ is simply related to $D^0\psi$. 
Repeating the operation which led to equation (2.37), we find 

$$\begin{aligned}
\frac{1}{2m}(-i\mathbf{D}')^2\psi' &= \exp(iq\chi)\cdot\frac{1}{2m}(-i\mathbf{D})^2\psi \\
&= \exp(iq\chi)\cdot iD^0\psi \quad \text{(using equation (2.29))} \\
&= iD^{0\prime}\psi' \quad \text{(using equation (2.38))}.
\end{aligned} \qquad (2.39)$$

Equation (2.39) is just (2.33) written in the D notation of equation (2.30), so we have verified 
that (2.34) is the correct relationship between $\psi'$ and $\psi$ to ensure consistency between 
equations (2.29) and (2.33). Precisely this consistency is summarized by the statement that 
(2.29) is gauge covariant. 
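
The key step (2.36) can also be checked numerically. The sketch below is a minimal illustration in one space dimension, using central differences; the wavefunction `psi`, the potential `A`, the gauge function `chi`, the charge `Q` and the sample point `x0` are all hypothetical choices made for the test and are not taken from the text.

```python
import cmath
import math

Q = 1.5   # charge q (illustrative value)
H = 1e-5  # finite-difference step

def d_dx(f, x):
    """Central-difference derivative of f at x."""
    return (f(x + H) - f(x - H)) / (2 * H)

# Hypothetical one-dimensional wavefunction, potential and gauge function.
def psi(x):
    return cmath.exp(2j * x) * math.exp(-x * x)

def A(x):
    return math.sin(x)

def chi(x):
    return 2 * x + x ** 3

def psi_prime(x):
    """psi' = exp(i q chi) psi, equation (2.34)."""
    return cmath.exp(1j * Q * chi(x)) * psi(x)

def A_prime(x):
    """A' = A + d(chi)/dx, equation (2.31) in one dimension."""
    return A(x) + d_dx(chi, x)

def minus_i_D(wave, potential, x):
    """(-i d/dx - q A) acting on wave, evaluated at x."""
    return -1j * d_dx(wave, x) - Q * potential(x) * wave(x)

x0 = 0.4
phase = cmath.exp(1j * Q * chi(x0))
# Left-hand side of (2.36) minus its right-hand side.
covariant_mismatch = abs(minus_i_D(psi_prime, A_prime, x0) - phase * minus_i_D(psi, A, x0))
# The same comparison for the bare gradient, without the compensating potential.
ordinary_mismatch = abs(-1j * d_dx(psi_prime, x0) - phase * (-1j * d_dx(psi, x0)))
```

With these choices `covariant_mismatch` is at the level of the discretization error, whereas `ordinary_mismatch` is of order one: the phase factor ‘passes through’ the combined operator but not through the gradient alone.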

Do $\psi$ and $\psi'$ describe the same physics, in fact? The answer is yes, but it is not quite 
trivial. It is certainly obvious that the probability densities $|\psi|^2$ and $|\psi'|^2$ are equal, since in 
fact $\psi$ and $\psi'$ in equation (2.34) are related by a phase transformation. However, we can be 
interested in other observables involving the derivative operators $\nabla$ or $\partial/\partial t$, for example 
the current, which is essentially $\psi^*(\nabla\psi) - (\nabla\psi)^*\psi$. It is easy to check that this current is 
not invariant under (2.34), because the phase $\chi(\mathbf{x}, t)$ is x-dependent. But equations (2.37) 
and (2.38) show us what we must do to construct gauge-invariant currents: namely, we 
must replace $\nabla$ by $\mathbf{D}$ (and in general also $\partial/\partial t$ by $D^0$) since then: 


$$\psi'^*(\mathbf{D}'\psi') = \psi^*\exp(-iq\chi)\cdot\exp(iq\chi)\cdot(\mathbf{D}\psi) = \psi^*\mathbf{D}\psi \qquad (2.40)$$


for example. Thus the identity of the physics described by $\psi$ and $\psi'$ is indeed ensured. 
Note, incidentally, that the equality between the first and last terms in (2.40) is indeed a 
statement of (gauge) invariance. 
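
The statement about currents can be illustrated in the same spirit. The following sketch, again in one dimension and with hypothetical choices for `psi`, `A`, `chi`, `Q` and the sample point, compares the ordinary current-like combination with the one in which the derivative is replaced by D, before and after the transformation (2.34).

```python
import cmath
import math

Q = 1.5   # charge q (illustrative value)
H = 1e-5  # finite-difference step

def d_dx(f, x):
    """Central-difference derivative of f at x."""
    return (f(x + H) - f(x - H)) / (2 * H)

# Hypothetical one-dimensional wavefunction, potential and gauge function.
def psi(x):
    return cmath.exp(2j * x) * math.exp(-x * x)

def A(x):
    return math.sin(x)

def chi(x):
    return 2 * x + x ** 3

def psi_prime(x):
    """Gauge-transformed wavefunction, equation (2.34)."""
    return cmath.exp(1j * Q * chi(x)) * psi(x)

def A_prime(x):
    """Gauge-transformed potential, equation (2.31) in one dimension."""
    return A(x) + d_dx(chi, x)

def ordinary_current(wave, x):
    """psi* (d psi/dx) - (d psi/dx)* psi."""
    grad = d_dx(wave, x)
    return wave(x).conjugate() * grad - grad.conjugate() * wave(x)

def covariant_current(wave, potential, x):
    """The same combination with d/dx replaced by D = d/dx - i q A."""
    d_wave = d_dx(wave, x) - 1j * Q * potential(x) * wave(x)
    return wave(x).conjugate() * d_wave - d_wave.conjugate() * wave(x)

x0 = 0.4
ordinary_change = abs(ordinary_current(psi_prime, x0) - ordinary_current(psi, x0))
covariant_change = abs(
    covariant_current(psi_prime, A_prime, x0) - covariant_current(psi, A, x0)
)
```

In this example `ordinary_change` is of order one, while `covariant_change` is zero to within the discretization error, in line with (2.40).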

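We summarize these important considerations by the statement that the gauge invariance 
of the Maxwell equations re-emerges as a covariance in quantum mechanics provided we 
make the combined transformation 

$$\begin{aligned}
\mathbf{A} &\to \mathbf{A}' = \mathbf{A} + \nabla\chi \\
V &\to V' = V - \partial\chi/\partial t \\
\psi &\to \psi' = \exp(iq\chi)\psi
\end{aligned} \qquad (2.41)$$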


on the potential and on the wavefunction. 
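
A simple special case makes the combined transformation (2.41) concrete. Start from a free plane wave, which solves (2.29) with A = 0 and V = 0, and apply (2.41): the new potentials are ‘pure gauge’ (they give E = B = 0), and the transformed wavefunction should solve (2.33). The Python sketch below checks this numerically in one space dimension. It is an illustration only; the values of `M`, `Q`, `K`, the gauge function `chi` and the sample point are hypothetical choices.

```python
import cmath

M, Q, K = 1.0, 1.0, 1.0  # mass, charge, wavenumber (illustrative values, hbar = 1)
H = 1e-4                 # finite-difference step

def d_dx(f, x, t):
    """Central-difference derivative with respect to x."""
    return (f(x + H, t) - f(x - H, t)) / (2 * H)

def d_dt(f, x, t):
    """Central-difference derivative with respect to t."""
    return (f(x, t + H) - f(x, t - H)) / (2 * H)

def chi(x, t):
    """Hypothetical gauge function."""
    return x * x * t

def psi_free(x, t):
    """Free plane wave with energy K^2 / 2M."""
    return cmath.exp(1j * (K * x - K * K * t / (2 * M)))

def zero(x, t):
    """Vanishing potential."""
    return 0.0

def A_prime(x, t):
    """A' = 0 + d(chi)/dx."""
    return d_dx(chi, x, t)

def V_prime(x, t):
    """V' = 0 - d(chi)/dt."""
    return -d_dt(chi, x, t)

def psi_prime(x, t):
    """psi' = exp(i q chi) psi."""
    return cmath.exp(1j * Q * chi(x, t)) * psi_free(x, t)

def residual(psi, A, V, x, t):
    """[(1/2m)(-i d/dx - qA)^2 + qV - i d/dt] psi at (x, t), cf. (2.29)."""
    def p_psi(x1, t1):
        # (-i d/dx - qA) psi, regarded as a new function of (x, t)
        return -1j * d_dx(psi, x1, t1) - Q * A(x1, t1) * psi(x1, t1)
    kinetic = (-1j * d_dx(p_psi, x, t) - Q * A(x, t) * p_psi(x, t)) / (2 * M)
    return kinetic + Q * V(x, t) * psi(x, t) - 1j * d_dt(psi, x, t)

x0, t0 = 0.3, 0.5
r_original = abs(residual(psi_free, zero, zero, x0, t0))
r_transformed = abs(residual(psi_prime, A_prime, V_prime, x0, t0))
r_mismatched = abs(residual(psi_free, A_prime, V_prime, x0, t0))
```

Both `r_original` and `r_transformed` vanish to numerical accuracy, whereas `r_mismatched` (new potentials, old wavefunction) does not: the potentials and the wavefunction must be transformed together.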

The Schrödinger equation is non-relativistic, but the Maxwell equations are of course 
fully relativistic. One might therefore suspect that the prescriptions discovered here are 
actually true relativistically as well, and this is indeed the case. We shall introduce the 
spin-0 and spin-½ relativistic equations in chapter 3. For the present, we note that (2.30) 
can be written in manifestly Lorentz covariant form as 

$$D^\mu \equiv \partial^\mu + iqA^\mu \qquad (2.42)$$

in terms of which (2.37) and (2.38) become 

$$-iD'^\mu\psi' = \exp(iq\chi)\cdot(-iD^\mu\psi). \qquad (2.43)$$

It follows that any equation involving the operator $\partial^\mu$ can be made gauge invariant under 
the combined transformation 

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi$$
$$\psi \to \psi' = \exp(iq\chi)\psi$$

if $\partial^\mu$ is replaced by $D^\mu$. In fact, we seem to have a very simple prescription for obtaining 
the wave equation for a particle in the presence of an electromagnetic field from the 
corresponding free particle wave equation: make the replacement 

$$\partial^\mu \to D^\mu \equiv \partial^\mu + iqA^\mu. \qquad (2.44)$$

In the following section, this will be seen to be the basis of the so-called gauge principle 
whereby, in accordance with the idea advanced in the previous sections, the form of the 
interaction is determined by the insistence on (local) gauge invariance. 

One final remark: this new kind of derivative 


$$D^\mu \equiv \partial^\mu + iqA^\mu \qquad (2.45)$$


turns out to be of fundamental importance—it will be the operator which generalizes from 
the (Abelian) phase symmetry of QED (see comment (iii) of section 2.6) to the (non- 
Abelian) phase symmetry of our weak and strong interaction theories. It is called the gauge 
covariant derivative, the term being usually shortened to ‘covariant derivative’ in the present 
context. The geometrical significance of this term will be explained in volume 2. 


2.5 The argument reversed: the gauge principle 


In the preceding section, we took it as known that the Schrödinger equation, for example, 
for a charged particle in an electromagnetic field, has the form 


$$\left[\frac{1}{2m}(-i\nabla - q\mathbf{A})^2 + qV\right]\psi = i\partial\psi/\partial t. \qquad (2.46)$$


We then checked its gauge invariance under the combined transformation


$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi$$
$$V \to V' = V - \partial\chi/\partial t \qquad (2.47)$$
$$\psi \to \psi' = \exp(iq\chi)\psi.$$


We now want to reverse the argument. We shall start by demanding that our theory is 
invariant under the space-time-dependent phase transformation 


$$\psi(\mathbf{x},t) \to \psi'(\mathbf{x},t) = \exp[iq\chi(\mathbf{x},t)]\psi(\mathbf{x},t). \qquad (2.48)$$


We shall demonstrate that such a phase invariance is not possible for a free theory, but 
rather requires an interacting theory involving a (4-vector) field whose interactions with 
the charged particle are precisely determined, and which undergoes the transformation


$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\chi \qquad (2.49)$$
$$V \to V' = V - \partial\chi/\partial t \qquad (2.50)$$


when $\psi \to \psi'$. The demand of this type of phase invariance will have then dictated the form
of the interaction—this is the basis of the gauge principle. 

Before proceeding we note that the resulting equation—which will of course turn out 
to be (2.29)—will not strictly speaking be invariant under (2.48), but rather covariant 
(in the gauge sense), as we saw in the preceding section. Nevertheless, we shall in this 
section sometimes continue (slightly loosely) to speak of ‘local phase invariance’. When we 
come to implement these ideas in quantum field theory in chapter 7 (section 7.4), using 
the Lagrangian formalism, we shall see that the relevant Lagrangians are indeed invariant 
under (2.48).

We therefore focus attention on the phase of the wavefunction. The absolute phase
of a wavefunction in quantum mechanics cannot be measured; only relative phases are
measurable, via some sort of interference experiment. A simple example is provided by the
diffraction of particles by a two-slit system. Downstream from the slits, the wavefunction is
a coherent superposition of two components, one originating from each slit. Symbolically,


$$\psi = \psi_1 + \psi_2. \qquad (2.51)$$


The probability distribution $|\psi|^2$ will then involve, in addition to the separate intensities
$|\psi_1|^2$ and $|\psi_2|^2$, the interference term


$$2\,\mathrm{Re}(\psi_1^*\psi_2) = 2|\psi_1||\psi_2|\cos\delta \qquad (2.52)$$


where $\delta$ ($= \delta_1 - \delta_2$) is the phase difference between components $\psi_1$ and $\psi_2$. The familiar
pattern of alternating intensity maxima and minima is then attributed to variation in the
phase difference $\delta$. Where the components are in phase, the interference is constructive and
$|\psi|^2$ has a maximum; where they are out of phase, it is destructive and $|\psi|^2$ has a minimum.
It is clear that if the individual phases $\delta_1$ and $\delta_2$ are each shifted by the same amount, there
will be no observable consequences since only the phase difference $\delta$ enters.

The situation in which the wavefunction can be changed in a certain way without leading 
to any observable effects is precisely what is entailed by a symmetry or invariance principle 
in quantum mechanics. In the case under discussion, the invariance is that of a constant 
overall change in phase. In performing calculations, it is necessary to make some definite 
choice of phase, that is, to adopt a ‘phase convention’. The invariance principle guarantees 
that any such choice, or convention, is equivalent to any other. 

Invariance under a constant change in phase is an example of a global invariance according
to the terminology introduced in the previous section. We make this point quite explicit
by writing out the transformation as


$$\psi \to \psi' = e^{i\alpha}\psi, \qquad \alpha = \text{constant} \qquad \text{(global phase transformation).} \qquad (2.53)$$


That $\alpha$ in (2.53) is a constant, the same for all space-time points, expresses the fact that
once a phase convention (choice of $\alpha$) has been made at one space-time point, the same
must be adopted at all other points. Thus in the two-slit experiment we are not free to make 
a local change of phase: for example, as discussed by ’t Hooft (1980), inserting a half-wave 
plate behind just one of the slits will certainly have observable consequences. 
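
The difference between a common phase shift and a shift applied to one component only can be made concrete numerically. The sketch below is purely illustrative: the amplitudes, phases and the helper names `intensity` and `interference_term` are our own assumptions, and a relative phase of $\pi$ on one component simply stands in for the half-wave plate.

```python
import cmath
import math


def intensity(amp1, phase1, amp2, phase2):
    """Return |psi1 + psi2|^2 for two components of given modulus and phase."""
    psi1 = amp1 * cmath.exp(1j * phase1)
    psi2 = amp2 * cmath.exp(1j * phase2)
    return abs(psi1 + psi2) ** 2


def interference_term(amp1, phase1, amp2, phase2):
    """Return 2|psi1||psi2| cos(delta) with delta = phase1 - phase2, as in (2.52)."""
    return 2.0 * amp1 * amp2 * math.cos(phase1 - phase2)


# Hypothetical amplitudes and phases at one point of the screen.
A1, A2 = 1.0, 0.7
D1, D2 = 0.9, 0.2

reference = intensity(A1, D1, A2, D2)
# Global change: the same shift is applied to both components.
global_shift = intensity(A1, D1 + 1.234, A2, D2 + 1.234)
# Local change: a phase shift of pi on one component only.
one_slit_shift = intensity(A1, D1 + math.pi, A2, D2)

print(reference, global_shift, one_slit_shift)
```

The first two numbers agree, while the third differs because the sign of the interference term has been reversed.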

There is a sense in which this may seem an unnatural state of affairs. Once a phase 
convention has been adopted at one space-time point, the same convention must be adopted 
at all other ones: the half-wave plate must extend instantaneously across all of space, or 
not at all. Following this line of thought, one might then be led to ‘explore the possibility’ 
of requiring invariance under local phase transformations: that is, independent choices of 
phase convention at each space-time point. By itself, the foregoing is not a compelling 
motivation for such a step. However, as we pointed out in section 2.3, such a move from a
global to a local invariance is apparently of crucial significance in classical electromagnetism 
and general relativity, and seems now to provide the key to an understanding of the other 
interactions in the SM. Let us see, then, where the demand of ‘local phase invariance’ 


$$\psi(\mathbf{x},t) \to \psi'(\mathbf{x},t) = \exp[i\alpha(\mathbf{x},t)]\psi(\mathbf{x},t) \qquad \text{(local phase transformation)} \qquad (2.54)$$


leads us.

There is immediately a problem: this is not an invariance of the free-particle Schrödinger
equation or of any free-particle relativistic wave equation! For example, if the original wavefunction
$\psi(\mathbf{x},t)$ satisfied the free-particle Schrödinger equation

$$\frac{1}{2m}(-i\nabla)^2\psi(\mathbf{x},t) = i\partial\psi(\mathbf{x},t)/\partial t \qquad (2.55)$$

then the wavefunction $\psi'$, given by the local phase transformation, will not, since both
$\nabla$ and $\partial/\partial t$ now act on $\alpha(\mathbf{x},t)$ in the phase factor. Thus local phase invariance is not an
invariance of the free-particle wave equation. If we wish to satisfy the demands of local phase 
invariance, we are obliged to modify the free-particle Schrödinger equation into something 
for which there is a local phase invariance—or rather, more accurately, a corresponding 
covariance. But this modified equation will no longer describe a free particle. In other 
words, the freedom to alter the phase of a charged particle’s wavefunction locally is only 
possible if some kind of force field is introduced in which the particle moves. In more physical 
terms, the covariance will now be manifested in the inability to distinguish observationally 
between the effect of making a local change in phase convention and the effect of some new 
field in which the particle moves. 
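
The extra terms generated by a local phase can be displayed explicitly. This is a short worked step under the assumption that $\psi' = e^{i\alpha(\mathbf{x},t)}\psi$ with $\psi$ a solution of the free-particle equation (2.55); nothing else is assumed.

```latex
\begin{align*}
-i\nabla\psi' &= e^{i\alpha}\left(-i\nabla + \nabla\alpha\right)\psi, \\
i\,\frac{\partial\psi'}{\partial t} &= e^{i\alpha}\left(i\,\frac{\partial}{\partial t}
   - \frac{\partial\alpha}{\partial t}\right)\psi, \\
\text{so that}\qquad
\frac{1}{2m}\left(-i\nabla - \nabla\alpha\right)^{2}\psi' &=
   \left(i\,\frac{\partial}{\partial t} + \frac{\partial\alpha}{\partial t}\right)\psi' .
\end{align*}
```

The transformed wavefunction therefore obeys an equation in which $\nabla\alpha$ and $\partial\alpha/\partial t$ appear exactly where a vector and a scalar potential would, which is the sense in which a local change of phase cannot be distinguished from the effect of a field.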

What kind of field will this be? In fact, we know immediately what the answer is, since 
the local phase transformation 


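$$\psi \to \psi' = \exp[i\alpha(\mathbf{x},t)]\psi \qquad (2.56)$$


with $\alpha = q\chi$ is just the phase transformation associated with electromagnetic gauge invariance!
Thus we must modify the Schrödinger equation


$$\frac{1}{2m}(-i\nabla)^2\psi = i\partial\psi/\partial t \qquad (2.57)$$


to

$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = (i\partial/\partial t - qV)\psi \qquad (2.58)$$


and satisfy the local phase invariance

$$\psi \to \psi' = \exp[i\alpha(\mathbf{x},t)]\psi \qquad (2.59)$$

by demanding that $\mathbf{A}$ and $V$ transform by

$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + q^{-1}\nabla\alpha$$
$$V \to V' = V - q^{-1}\partial\alpha/\partial t \qquad (2.60)$$


when $\psi \to \psi'$. The modified wave equation is of course precisely the Schrödinger equation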
describing the interaction of the charged particle with the electromagnetic field described 
by A and V. 
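
The statement that (2.58) together with (2.60) restores what the free equation (2.57) loses can be tested numerically in one space dimension. The following is a minimal finite-difference sketch, not a general solver: the values of the mass, charge and wavenumber, the particular phase function `alpha`, and all function names are hypothetical choices made for illustration, and the potentials are obtained from (2.60) starting from $\mathbf{A} = 0$, $V = 0$.

```python
import cmath
import math

M, Q, K = 1.0, 0.5, 1.3           # mass, charge, wavenumber: illustrative values only
OMEGA = K * K / (2.0 * M)         # free-particle relation between energy and momentum
H = 1e-3                          # finite-difference step


def alpha(x, t):
    """An arbitrary smooth local phase (a hypothetical choice)."""
    return 0.3 * math.sin(x) + 0.2 * x * t


def vec_pot(x, t):
    """A' = A + (1/q) d(alpha)/dx from (2.60), starting from A = 0."""
    return (0.3 * math.cos(x) + 0.2 * t) / Q


def scalar_pot(x, t):
    """V' = V - (1/q) d(alpha)/dt from (2.60), starting from V = 0."""
    return -(0.2 * x) / Q


def psi_free(x, t):
    return cmath.exp(1j * (K * x - OMEGA * t))


def psi_local(x, t):
    return cmath.exp(1j * alpha(x, t)) * psi_free(x, t)


def d_dx(f, x, t):
    return (f(x + H, t) - f(x - H, t)) / (2.0 * H)


def d_dt(f, x, t):
    return (f(x, t + H) - f(x, t - H)) / (2.0 * H)


def free_residual(f, x, t):
    """Left side minus right side of the free equation (2.57)."""
    d2 = (f(x + H, t) - 2.0 * f(x, t) + f(x - H, t)) / (H * H)
    return -d2 / (2.0 * M) - 1j * d_dt(f, x, t)


def gauged_residual(f, x, t):
    """Left side minus right side of the modified equation (2.58)."""
    def kinetic(xx, tt):
        # (-i d/dx - qA) acting on f
        return -1j * d_dx(f, xx, tt) - Q * vec_pot(xx, tt) * f(xx, tt)

    lhs = (-1j * d_dx(kinetic, x, t) - Q * vec_pot(x, t) * kinetic(x, t)) / (2.0 * M)
    rhs = 1j * d_dt(f, x, t) - Q * scalar_pot(x, t) * f(x, t)
    return lhs - rhs


X0, T0 = 0.4, 0.7
print(abs(free_residual(psi_free, X0, T0)))     # close to zero
print(abs(free_residual(psi_local, X0, T0)))    # clearly non-zero
print(abs(gauged_residual(psi_local, X0, T0)))  # close to zero again
```

The residuals vanish only up to the truncation error of the finite differences, so "close to zero" here means small compared with the order-one residual of the middle case.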

In a Lorentz covariant treatment, $\mathbf{A}$ and $V$ will be regarded as parts of a 4-vector
$A^\mu$, just as $-\nabla$ and $\partial/\partial t$ are parts of $\partial^\mu$ (see problem 2.1). Thus the presence of the
vector field $A^\mu$, interacting in a ‘universal’ prescribed way with any particle of charge $q$,
is dictated by local phase invariance. A vector field such as $A^\mu$, introduced to guarantee
local phase invariance, is called a ‘gauge field’. The principle that the interaction should be
so dictated by the phase (or gauge) invariance is called the gauge principle; it allows us to
write down the wave equation for the interaction directly from the free particle equation via
the replacement (2.44). As before, the method clearly generalizes to the four-dimensional
case.


3 Actually the electromagnetic interaction is uniquely specified by this procedure only for particles of
spin-0 or spin-$\frac{1}{2}$. The spin-1 case will be discussed in volume 2.


2.6 Comments on the gauge principle in electromagnetism

Comment (i) 


A properly sceptical reader may have detected an important sleight of hand in the previous 
discussion. Where exactly did the electromagnetic charge appear from? The trouble with 
our argument as so far presented is that we could have defined fields A and V so that they 
coupled equally to all particles—instead we smuggled in a factor q. 

Actually, we can do a bit better than this. We can use the fact that the electromagnetic 
charge is absolutely conserved to claim that there can be no quantum mechanical interference
between states of different charge $q$. Hence different phase changes are allowed within
each ‘sector’ of definite $q$:

$$\psi' = \exp(iq\chi)\psi \qquad (2.61)$$

let us say. When this becomes a local transformation, $\chi \to \chi(\mathbf{x},t)$, we shall need to cancel
a term $q\nabla\chi$, which will imply the presence of a ‘$-q\mathbf{A}$’ term, as required. Note that such
an argument is only possible for an absolutely conserved quantum number $q$; otherwise
we cannot split up the states of the system into non-communicating sectors specified by
different values of $q$. Reversing this line of reasoning, a conservation law such as baryon
number conservation, with no related gauge field, would therefore now be suspected of not 
being absolutely conserved. 

We still have not tied down why q is the electromagnetic charge and not some other 
absolutely conserved quantum number. A proper discussion of the reasons for identifying 
A with the electromagnetic potential and q with the particle’s charge will be given in 
chapter 6 with the help of quantum field theory. 


Comment (ii) 


Accepting these identifications, we note that the form of the interaction contains but one 
parameter, the electromagnetic charge q of the particle in question. It is the same whatever 
the type of particle with charge q, whether it be lepton, hadron, nucleus, ion, atom, etc. 
Precisely this type of ‘universality’ is present in the weak couplings of quarks and leptons, as 
we shall see in volume 2. This strongly suggests that some form of gauge principle must be 
at work in generating weak interactions as well. The associated symmetry or conservation 
law is, however, of a very subtle kind. Incidentally, although all particles of a given charge 
q interact electromagnetically in a universal way, there is nothing at all in the preceding ar- 
gument to indicate why, in nature, the charges of observed particles are all integer multiples 
of one basic charge. 


Comment (iii) 


Returning to comment (i), we may wish that we did not have to introduce the absolute 
conservation of charge as a separate axiom. As remarked earlier, at the end of section 2.2, 
we should like to relate that conservation law to the symmetry involved, namely invariance 
under (2.54). It is worth looking at the nature of this symmetry in a little more detail. It is
not a symmetry which (as in the case of translation and rotation invariances, for instance)
involves changes in the space-time coordinates $\mathbf{x}$ and $t$. Instead, it operates on the real and
imaginary parts of the wavefunction. Let us write 


$$\psi = \psi_R + i\psi_I. \qquad (2.62)$$

Then

$$\psi' = e^{i\alpha}\psi = \psi'_R + i\psi'_I \qquad (2.63)$$

can be written as


$$\psi'_R = (\cos\alpha)\psi_R - (\sin\alpha)\psi_I$$
$$\psi'_I = (\sin\alpha)\psi_R + (\cos\alpha)\psi_I \qquad (2.64)$$


from which we can see that it is indeed a kind of ‘rotation’, but in the $\psi_R$–$\psi_I$ plane,
whose ‘coordinates’ are the real and imaginary parts of the wavefunction. We call this plane
an internal space and the associated symmetry an internal symmetry. Thus our phase
invariance can be looked upon as a kind of internal space rotational invariance.

We can imagine doing two successive such transformations


$$\psi \to \psi' \to \psi'' \qquad (2.65)$$

where

$$\psi'' = e^{i\beta}\psi' \qquad (2.66)$$

and so

$$\psi'' = e^{i(\alpha+\beta)}\psi = e^{i\delta}\psi \qquad (2.67)$$


with $\delta = \alpha + \beta$. This is a transformation of the same form as the original one. The set of all
such transformations forms what mathematicians call a group, in this case U(1), meaning 
the group of all unitary one-dimensional matrices. A unitary matrix U is one such that 


$$UU^\dagger = U^\dagger U = 1 \qquad (2.68)$$


where 1 is the identity matrix and $^\dagger$ denotes the Hermitian conjugate. A one-dimensional
matrix is of course a single number, in this case a complex number. Condition (2.68) limits
this to being a simple phase: the set of phase factors of the form $e^{i\alpha}$, where $\alpha$ is any real
number, form the elements of a U(1) group. These are just the factors that enter into our
gauge (or phase) transformations for wavefunctions. Thus we say that the electromagnetic
gauge group is U(1). We must remember, however, that it is a local U(1), meaning (cf
(2.54)) that the phase parameters $\alpha, \beta, \ldots$ depend on the space-time point $x$.

The transformations of the U(1) group have the simple property that it does not matter
in what order they are performed. Referring to (2.65)–(2.67), we would have got the same
final answer if we had done the $\beta$ ‘rotation’ first and then the $\alpha$ one, instead of the other
way around. This is because, of course,


$$\exp(i\alpha)\cdot\exp(i\beta) = \exp[i(\alpha+\beta)] = \exp(i\beta)\cdot\exp(i\alpha). \qquad (2.69)$$


This property remains true even in the ‘local’ case when $\alpha$ and $\beta$ depend on $x$. Mathematicians
call U(1) an Abelian group: different transformations commute. We shall see later (in
volume 2) that the ‘internal’ symmetry spaces relevant to the strong and weak gauge invari- 
ances are not so simple. The ‘rotations’ in these cases are more like full three-dimensional 
rotations of real space, rather than the two-dimensional rotation of (2.64). We know that, in 
general, such real-space rotations do not commute, and the same will be true of the strong 
and weak rotations. Their gauge groups are called non-Abelian. 
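
The contrast between commuting phase rotations and non-commuting real-space rotations is easy to exhibit numerically. The sketch below is an illustration only; the parameter values and helper names are our own, and the pair of rotations about the $x$ and $z$ axes is just one convenient example of three-dimensional rotations.

```python
import cmath
import math


def rotate_plane(psi_r, psi_i, angle):
    """Rotate the point (psi_R, psi_I) through `angle`, as in (2.64)."""
    c, s = math.cos(angle), math.sin(angle)
    return c * psi_r - s * psi_i, s * psi_r + c * psi_i


def rot_x(angle):
    """3 x 3 matrix for a real-space rotation about the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(angle):
    """3 x 3 matrix for a real-space rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))


ALPHA, BETA = 0.8, 2.1           # hypothetical phase parameters
psi = complex(0.6, -0.3)         # hypothetical wavefunction value at one point

# Multiplying by exp(i alpha) is the same as rotating (psi_R, psi_I).
phase_rotated = cmath.exp(1j * ALPHA) * psi
plane_rotated = complex(*rotate_plane(psi.real, psi.imag, ALPHA))

# U(1): the order of two phase transformations does not matter.
u1_commutator = (cmath.exp(1j * ALPHA) * cmath.exp(1j * BETA)
                 - cmath.exp(1j * BETA) * cmath.exp(1j * ALPHA))

# Real-space rotations about different axes: the order does matter.
quarter = math.pi / 2.0
so3_mismatch = max_abs_diff(matmul(rot_x(quarter), rot_z(quarter)),
                            matmul(rot_z(quarter), rot_x(quarter)))

print(abs(phase_rotated - plane_rotated), abs(u1_commutator), so3_mismatch)
```

The first two printed numbers are zero to rounding accuracy, while the third is of order one, reflecting the non-Abelian character of rotations in three dimensions.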

Once again, we shall have to wait until chapter 7 before understanding how the symmetry 
represented by (2.63) is really related to the conservation law of charge. 


Comment (iv) 


The attentive reader may have picked up one further loose end. The vector potential A is 
related to the magnetic field B by 


$$\mathbf{B} = \nabla\times\mathbf{A}. \qquad (2.70)$$

Thus if $\mathbf{A}$ has the special form

$$\mathbf{A} = \nabla f \qquad (2.71)$$


B will vanish. The question we must answer, therefore, is: how do we know that the A 
field introduced by our gauge principle is not of the form (2.71), leading to a trivial theory 
(B = 0)? The answer to this question will lead us on a very worthwhile detour. 

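The Schrödinger equation with $\nabla f$ as the vector potential is


$$\frac{1}{2m}(-i\nabla - q\nabla f)^2\psi = E\psi. \qquad (2.72)$$


We can write the formal solution to this equation as


$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}}\nabla f\cdot d\mathbf{l}\right)\cdot\psi(f=0) \qquad (2.73)$$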


which may be checked by using the fact that 


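$$\frac{\partial}{\partial a}\int^{a} f(t)\,dt = f(a). \qquad (2.74)$$


The notation $\psi(f=0)$ means just the free-particle solution with $f = 0$; the line integral is
taken along an arbitrary path ending in the point $\mathbf{x}$. But we have


$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz \equiv \nabla f\cdot d\mathbf{l}. \qquad (2.75)$$

Hence the integral can be done trivially and the solution becomes


$$\psi = \exp[iq(f(\mathbf{x}) - f(-\infty))]\cdot\psi(f=0). \qquad (2.76)$$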


We say that the phase factor introduced by the (in reality, field-free) vector potential $\mathbf{A} = \nabla f$
is integrable: the effect of this particular $\mathbf{A}$ is merely to multiply the free-particle solution
by an $\mathbf{x}$-dependent phase (apart from a trivial constant phase). Since this $\mathbf{A}$ should give
no real electromagnetic effect, we must hope that such a change in the wavefunction is also
somehow harmless. Indeed Dirac showed (Dirac 1981, pp 92-3) that such a phase factor
corresponds merely to a redefinition of the momentum operator $\hat{p}$. The essential point is
that (in one dimension, say) $\hat{p}$ is defined ultimately by the commutator ($\hbar = 1$)


$$[\hat{x}, \hat{p}] = i. \qquad (2.77)$$

Certainly the familiar choice

$$\hat{p} = -i\frac{\partial}{\partial x} \qquad (2.78)$$


satisfies this commutation relation. But we can also add any function of $x$ to $\hat{p}$, and this
modified $\hat{p}$ will be still satisfactory since $x$ commutes with any function of $x$. More detailed
considerations by Dirac showed that this arbitrary function must actually have the form
$\partial F/\partial x$, where $F$ is arbitrary. Thus


$$\hat{p} = -i\frac{\partial}{\partial x} + \frac{\partial F}{\partial x} \qquad (2.79)$$

is an acceptable momentum operator. Consider then the quantum mechanics defined by
the wavefunction $\psi(f=0)$ and the momentum operator $\hat{p} = -i\partial/\partial x$. Under the unitary
transformation (cf (2.76))

$$\psi(f=0) \to e^{iqf(x)}\psi(f=0) \qquad (2.80)$$

$\hat{p}$ will be transformed to

$$\hat{p} \to e^{iqf(x)}\,\hat{p}\,e^{-iqf(x)}. \qquad (2.81)$$


But the right-hand side of this equation is just $\hat{p} - q\,\partial f/\partial x$ (problem 2.3), which is an equally
acceptable momentum operator, identifying $qf$ with the $F$ of Dirac. Thus the case $\mathbf{A} = \nabla f$
is indeed equivalent to the field-free case.


FIGURE 2.1
Two paths $C_1$ and $C_2$ (in two dimensions for simplicity) from $-\infty$ to the point $\mathbf{x}$.


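What of the physically interesting case in which $\mathbf{A}$ is not of the form $\nabla f$? The equation
is now


$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = E\psi \qquad (2.82)$$

to which the solution is

$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l}\right)\cdot\psi(\mathbf{A}=0). \qquad (2.83)$$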


The line integral can now not be done so trivially: one says that the A-field has produced 
a non-integrable phase factor. There is more to this terminology than the mere question of 
whether the integral is easy to do. The crucial point is that the integral now depends on the 
path followed in reaching the point x, whereas the integrable phase factor in (2.73) depends 
only on the end-points of the integral, not on the path joining them. 

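Consider two paths $C_1$ and $C_2$ (figure 2.1) from $-\infty$ to the point $\mathbf{x}$. The difference in the
two line integrals is the integral over a closed curve $C$, which can be evaluated by Stokes’
theorem:


$$\int_{C_1}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l} - \int_{C_2}^{\mathbf{x}}\mathbf{A}\cdot d\mathbf{l} = \oint_C\mathbf{A}\cdot d\mathbf{l} = \iint_S\nabla\times\mathbf{A}\cdot d\mathbf{S} = \iint_S\mathbf{B}\cdot d\mathbf{S} \qquad (2.84)$$


where $S$ is any surface spanning the curve $C$. In this form we see that if $\mathbf{A} = \nabla f$, then
indeed the line integrals over $C_1$ and $C_2$ are equal since $\nabla\times\nabla f = 0$, but if $\mathbf{B} = \nabla\times\mathbf{A}$ is
not zero, the difference between the integrals is determined by the enclosed flux of $\mathbf{B}$.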
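
The path independence for $\mathbf{A} = \nabla f$, and the path dependence otherwise, can be seen in a small numerical experiment. The following sketch works in two dimensions with finite paths rather than paths from $-\infty$; the function $f = x^2y$, the uniform field strength `B_Z`, the square contour and the helper `line_integral` are all hypothetical choices made for illustration.

```python
def line_integral(field, corners, steps=2000):
    """Midpoint-rule estimate of the line integral of `field` along the
    straight segments joining the listed corner points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for k in range(steps):
            ax, ay = field(x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy)
            total += ax * dx + ay * dy
    return total


def gradient_field(x, y):
    """A = grad f for the hypothetical choice f = x^2 y, so that curl A = 0."""
    return 2.0 * x * y, x * x


B_Z = 3.0  # hypothetical uniform field strength along z


def uniform_b_field(x, y):
    """A = (B/2)(-y, x), whose curl is B_Z along z."""
    return -0.5 * B_Z * y, 0.5 * B_Z * x


# Two paths from (0, 0) to (1, 1) which together enclose the unit square.
C1 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
C2 = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

grad_difference = line_integral(gradient_field, C1) - line_integral(gradient_field, C2)
flux_difference = line_integral(uniform_b_field, C1) - line_integral(uniform_b_field, C2)

print(grad_difference, flux_difference)
```

For the gradient field the two paths give the same result, $f(1,1) - f(0,0)$; for the second field the difference equals the flux through the enclosed unit area, in line with (2.84).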
This analysis turns out to imply the existence of a remarkable phenomenon: the
Aharonov-Bohm effect, named after its discoverers (Aharonov and Bohm 1959). Suppose
we go back to our two-slit experiment of section 2.5, only this time we imagine that a long
thin solenoid is inserted between the slits, so that the components $\psi_1$ and $\psi_2$ of the split
beam pass one on each side of the solenoid (figure 2.2). After passing round the solenoid,
the beams are recombined, and the resulting interference pattern is observed downstream.
At any point $\mathbf{x}$ of the pattern, the phase of the $\psi_1$ and $\psi_2$ components will be modified,
relative to the $\mathbf{B} = 0$ case, by factors of the form (2.83). These factors depend on the
respective paths, which are different for the two components $\psi_1$ and $\psi_2$. The phase difference
between these components, which determines the interference pattern, will therefore
involve the $\mathbf{B}$-dependent factor (2.84). Thus, even though the field $\mathbf{B}$ is essentially totally
contained within the solenoid, and the beams themselves have passed through $\mathbf{B} = 0$ regions
only, there is nevertheless an observable effect on the pattern provided $\mathbf{B} \neq 0$! This
effect, a shift in the pattern as $\mathbf{B}$ varies, was first confirmed experimentally by Chambers
(1960), soon after its prediction by Aharonov and Bohm. It was anticipated in work
by Ehrenburg and Siday (1949); further references and discussion are contained in Berry
(1984).


FIGURE 2.2
The Aharonov-Bohm effect.


Comment (v) 


In conclusion, we must emphasize that there is ultimately no compelling logic for the vital 
leap to a local phase invariance from a global one. The latter is, by itself, both necessary 
and sufficient in quantum field theory to guarantee local charge conservation. Nevertheless, 
the gauge principle—deriving interactions from the requirement of local phase invariance— 
provides a satisfying conceptual unification of the interactions present in the SM. In vol- 
ume 2 of this book we shall consider generalizations of the electromagnetic gauge principle. 
It will be important always to bear in mind that any attempt to base theories of non- 
electromagnetic interactions on some kind of gauge principle can only make sense if there is 
an exact symmetry involved. The reason for this will only become clear when we consider 
the renormalizability of QED in chapter 11. 


Problems 


2.1 


(a) A Lorentz transformation in the $x^1$ direction is given by

$$t' = \gamma(t - vx^1)$$
$$x^{1\prime} = \gamma(-vt + x^1)$$
$$x^{2\prime} = x^2, \qquad x^{3\prime} = x^3$$

where $\gamma = (1 - v^2)^{-1/2}$ and $c = 1$. Write down the inverse of this transformation
(i.e. express $(t, x^1)$ in terms of $(t', x^{1\prime})$), and use the ‘chain rule’ of partial differentiation
to show that, under the Lorentz transformation, the two quantities
$(\partial/\partial t, -\partial/\partial x^1)$ transform in the same way as $(t, x^1)$.

[The general result is that the four-component quantity
$(\partial/\partial t, -\partial/\partial x^1, -\partial/\partial x^2, -\partial/\partial x^3) \equiv (\partial/\partial t, -\nabla)$ transforms in the same way as $(t, x^1, x^2, x^3)$.
Four-component quantities transforming this way are said to be ‘contravariant
4-vectors’, and are written with an upper 4-vector index; thus $(\partial/\partial t, -\nabla) \equiv \partial^\mu$.
Upper indices can be lowered by using the metric tensor $g_{\mu\nu}$, see
appendix D, which reverses the sign of the spatial components. Thus
$\partial_\mu = (\partial/\partial t, \partial/\partial x^1, \partial/\partial x^2, \partial/\partial x^3)$. Similarly the four quantities $(\partial/\partial t, \nabla) \equiv
(\partial/\partial t, \partial/\partial x^1, \partial/\partial x^2, \partial/\partial x^3)$ transform as $(t, -x^1, -x^2, -x^3)$ and are a ‘covariant
4-vector’, denoted by $\partial_\mu$.]


(b) Check that equation (2.5) can be written as (2.17). 
2.2 How many independent components does the field strength $F^{\mu\nu}$ have? Express each
component in terms of electric and magnetic field components. Hence verify that equation
(2.18) correctly reproduces both equations (2.1) and (2.8).


2.3 Verify the result

$$e^{iqf(x)}\,\hat{p}\,e^{-iqf(x)} = \hat{p} - q\frac{\partial f}{\partial x}.$$

3 


Relativistic Quantum Mechanics 


It is clear that the non-relativistic Schrödinger equation is quite inadequate to analyse the 
results of experiments at energies far higher than the rest mass energies of the particles 
involved. Besides, the quarks and leptons have spin-$\frac{1}{2}$, a degree of freedom absent from the
Schrödinger wavefunction. We therefore need two generalizations: from non-relativistic to
relativistic for spin-0 particles, and from spin-0 to spin-$\frac{1}{2}$. The first step is to the Klein-Gordon
equation (section 3.1), the second to the Dirac equation (section 3.2). Then after
some further work on solutions of the Dirac equation (sections 3.3-3.4), we shall consider 
(section 3.5) some simple consequences of including the electromagnetic interaction via the 
gauge principle replacement (2.44). 


3.1 The Klein-Gordon equation


The non-relativistic Schrödinger equation may be put into correspondence with the non- 
relativistic energy-momentum relation 


$$E = \mathbf{p}^2/2m \qquad (3.1)$$

by means of the operator replacements$^1$

$$E \to i\partial/\partial t \qquad (3.2)$$
$$\mathbf{p} \to -i\nabla \qquad (3.3)$$


these differential operators being understood to act on the Schrödinger wavefunction. 

For a relativistic wave equation we must start with the correct relativistic energy- 
momentum relation. Energy and momentum appear as the ‘time’ and ‘space’ components 
of the momentum 4-vector


$$p^\mu = (E, \mathbf{p}) \qquad (3.4)$$

which satisfy the mass-shell condition


$$p^2 = p_\mu p^\mu = E^2 - \mathbf{p}^2 = m^2. \qquad (3.5)$$


Since energy and momentum are merely different components of a 4-vector, an attempt to
base a relativistic theory on the relation


$$E = +(\mathbf{p}^2 + m^2)^{1/2} \qquad (3.6)$$


is unattractive, as well as having obvious difficulties in interpretation for the square root
operator. Schrödinger, before settling for the less ambitious non-relativistic Schrödinger
equation, and later Klein and Gordon, attempted to build relativistic quantum mechanics
(RQM) from the squared relation


$$E^2 = \mathbf{p}^2 + m^2. \qquad (3.7)$$

Using the operator replacements for $E$ and $\mathbf{p}$ we are led to

$$-\partial^2\phi/\partial t^2 = (-\nabla^2 + m^2)\phi \qquad (3.8)$$


which is the Klein-Gordon equation (KG equation). We consider the case of a one-component
scalar wavefunction $\phi(\mathbf{x},t)$; one expects this to be appropriate for the description
of spin-0 bosons.


1 Recall $\hbar = c = 1$ throughout (see appendix B).


3.1.1 Solutions in coordinate space 


In terms of the D’Alembertian operator 


$$\Box \equiv \partial_\mu\partial^\mu = \frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.9)$$


the KG equation reads:


$$(\Box + m^2)\phi(\mathbf{x},t) = 0. \qquad (3.10)$$


Let us look for a plane-wave solution of the form


$$\phi(\mathbf{x},t) = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} = Ne^{-ip\cdot x} \qquad (3.11)$$

where we have written the exponent in suggestive 4-vector scalar product notation

$$p\cdot x = p^\mu x_\mu = Et - \mathbf{p}\cdot\mathbf{x} \qquad (3.12)$$


and $N$ is a normalization factor which need not be decided upon here (see section 8.1.1). In
order that this wavefunction be a solution of the KG equation, we find by direct substitution
that $E$ must be related to $\mathbf{p}$ by the condition


$$E^2 = \mathbf{p}^2 + m^2. \qquad (3.13)$$


This looks harmless enough, but it actually implies that for a given 3-momentum $\mathbf{p}$ there
are in fact two possible solutions for the energy, namely


$$E = \pm(\mathbf{p}^2 + m^2)^{1/2}. \qquad (3.14)$$


As Schrödinger and others quickly found, it is not possible to ignore the negative solutions 
without obtaining inconsistencies. What then do these negative-energy solutions mean? 
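
That both signs of the energy are admitted can be confirmed by substituting plane waves into the KG equation numerically. This is a rough finite-difference sketch in one space dimension; the mass, the momentum, the step size and the function names are illustrative assumptions, and the non-relativistic energy is included only as a case that should fail.

```python
import cmath
import math

MASS, P = 1.0, 0.8     # hypothetical mass and momentum (one space dimension)
H = 1e-3               # finite-difference step


def plane_wave(energy):
    """Return phi(x, t) = exp(-iEt + ipx), i.e. (3.11) with N = 1."""
    return lambda x, t: cmath.exp(-1j * energy * t + 1j * P * x)


def kg_residual(phi, x, t):
    """Finite-difference estimate of d2phi/dt2 - d2phi/dx2 + m^2 phi."""
    d2t = (phi(x, t + H) - 2.0 * phi(x, t) + phi(x, t - H)) / (H * H)
    d2x = (phi(x + H, t) - 2.0 * phi(x, t) + phi(x - H, t)) / (H * H)
    return d2t - d2x + MASS * MASS * phi(x, t)


E_REL = math.sqrt(P * P + MASS * MASS)   # positive root of (3.14)
E_NONREL = P * P / (2.0 * MASS)          # non-relativistic energy (3.1)

for energy in (E_REL, -E_REL, E_NONREL):
    print(energy, abs(kg_residual(plane_wave(energy), 0.3, 0.5)))
```

The residual is negligible for $E = \pm(\mathbf{p}^2 + m^2)^{1/2}$ and of order one for the non-relativistic value, so the equation itself offers no reason to discard the negative root.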


3.1.2 Probability current for the KG equation 


In exactly the same way as for the non-relativistic Schrödinger equation, it is possible to 
derive a conservation law for a ‘probability current’ of the KG equation. We have 


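$$\frac{\partial^2\phi}{\partial t^2} - \nabla^2\phi + m^2\phi = 0 \qquad (3.15)$$


and by multiplying this equation by $\phi^*$, and subtracting $\phi$ times the complex conjugate of
equation (3.15), one obtains, after some manipulation (see problem 3.1), the result


$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \qquad (3.16)$$

where

$$\rho = i\left[\phi^*\frac{\partial\phi}{\partial t} - \left(\frac{\partial\phi^*}{\partial t}\right)\phi\right] \qquad (3.17)$$

and

$$\mathbf{j} = i^{-1}[\phi^*\nabla\phi - (\nabla\phi^*)\phi] \qquad (3.18)$$


(the derivatives $(\partial_\mu\phi^*)$ act only within the bracket). In explicit 4-vector notation, this
conservation condition reads (cf problem 2.1 and equation (D.4) in appendix D)


$$\partial_\mu j^\mu = 0 \qquad (3.19)$$

with

$$j^\mu \equiv (\rho, \mathbf{j}) = i[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi]. \qquad (3.20)$$

Since $\phi$ of (3.11) is Lorentz invariant and $\partial^\mu$ is a contravariant 4-vector, equation (3.20)
shows explicitly that $j^\mu$ is a contravariant 4-vector, as anticipated in the notation.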

The spatial current $\mathbf{j}$ is identical in form to the Schrödinger current, but for the KG
case the ‘probability density’ now contains time derivatives since the KG equation is second
order in $\partial/\partial t$. This means that $\rho$ is not constrained to be positive definite, so how can
$\rho$ represent a probability density? We can see this problem explicitly for the plane-wave
solutions

$$\phi = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} \qquad (3.21)$$


which give (problem 3.1)

$$\rho = 2|N|^2E \qquad (3.22)$$


and $E$ can be positive or negative: that is, the sign of $\rho$ is the sign of energy.
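
The sign problem can be checked directly by evaluating the KG density for plane waves of either energy. The sketch below assumes the standard form $\rho = i[\phi^*\,\partial\phi/\partial t - (\partial\phi^*/\partial t)\phi]$ and estimates the time derivative by a central difference; the numerical values and function names are hypothetical.

```python
import cmath
import math

MASS, P, NORM = 1.0, 0.8, 0.5   # hypothetical mass, momentum and normalization N
H = 1e-4                        # finite-difference step


def plane_wave(energy):
    """Return phi(x, t) = N exp(-iEt + ipx), as in (3.21)."""
    return lambda x, t: NORM * cmath.exp(-1j * energy * t + 1j * P * x)


def kg_density(phi, x, t):
    """rho = i[phi* dphi/dt - (dphi*/dt) phi], with a central difference in t."""
    dphi_dt = (phi(x, t + H) - phi(x, t - H)) / (2.0 * H)
    value = 1j * (phi(x, t).conjugate() * dphi_dt - dphi_dt.conjugate() * phi(x, t))
    return value.real  # the bracket is purely imaginary, so rho is real


E_POS = math.sqrt(P * P + MASS * MASS)
rho_pos = kg_density(plane_wave(E_POS), 0.3, 0.5)
rho_neg = kg_density(plane_wave(-E_POS), 0.3, 0.5)

print(rho_pos, rho_neg)
```

Within the accuracy of the finite difference the two values are $\pm 2|N|^2|E|$, in agreement with (3.22).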

Historically, this problem of negative probabilities coupled with that of negative energies 
led to the abandonment of the KG equation. For the moment we will follow history, and turn 
to the Dirac equation. We shall see in section 3.4, however, how the negative-energy solutions 
of the KG equation do after all have a role to play, following Feynman’s interpretation, in 
processes involving anti-particles. Later, in chapters 5-7, we shall see how this interpretation 
arises naturally within the formalism of quantum field theory. 


3.2 The Dirac equation 


In the case of the KG equation, it is clear why the problem arose: 


(a) In constructing a wave equation in close correspondence with the squared energy-momentum
relation
$$E^2 = \mathbf{p}^2 + m^2$$
we immediately allowed negative-energy solutions.


(b) The KG equation has a $\partial^2/\partial t^2$ term: this leads to a continuity equation with a
‘probability density’ containing $\partial/\partial t$, and hence to negative probabilities.


Dirac approached these problems in his characteristically direct way. In order to obtain 
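a positive-definite probability density $\rho > 0$, he required an equation linear in $\partial/\partial t$. Then,
for relativistic covariance (see later), the equation must also be linear in $\nabla$. He postulated
the equation (Dirac 1928)


$$i\frac{\partial\psi(\mathbf{x},t)}{\partial t} = \left[-i\left(\alpha_1\frac{\partial}{\partial x^1} + \alpha_2\frac{\partial}{\partial x^2} + \alpha_3\frac{\partial}{\partial x^3}\right) + \beta m\right]\psi(\mathbf{x},t) = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi(\mathbf{x},t). \qquad (3.23)$$


What are the $\alpha$’s and $\beta$? To find the conditions on the $\alpha$’s and $\beta$, consider what we require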
of a relativistic wave equation: 


(a) the correct relativistic relation between E and p, namely 


$$E = +(\mathbf{p}^2 + m^2)^{1/2}$$


(b) the equation should be covariant under Lorentz transformations. 


We shall postpone discussion of (b) until the following chapter. To solve requirement (a), 
Dirac in fact demanded that his wavefunction $\psi$ satisfy, in addition, a KG-type condition


$$-\partial^2\psi/\partial t^2 = (-\nabla^2 + m^2)\psi. \qquad (3.24)$$


We note with hindsight that we have once more opened the door to negative-energy solutions.
Dirac’s remarkable achievement was to turn this apparent defect into one of the
triumphs of theoretical physics!

We can now derive conditions on $\boldsymbol{\alpha}$ and $\beta$. We have


$$i\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi \qquad (3.25)$$


and so, squaring the operator on both sides, 


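$$\left(i\frac{\partial}{\partial t}\right)^2\psi = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)(-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi$$
$$= -\sum_{i=1}^{3}\alpha_i^2\frac{\partial^2\psi}{(\partial x^i)^2} - \sum_{\substack{i,j=1\\ i>j}}^{3}(\alpha_i\alpha_j + \alpha_j\alpha_i)\frac{\partial^2\psi}{\partial x^i\partial x^j} - im\sum_{i=1}^{3}(\alpha_i\beta + \beta\alpha_i)\frac{\partial\psi}{\partial x^i} + \beta^2m^2\psi. \qquad (3.26)$$


But by our assumption that $\psi$ also satisfies the KG condition, we must have


$$\left(i\frac{\partial}{\partial t}\right)^2\psi = -\sum_{i=1}^{3}\frac{\partial^2\psi}{(\partial x^i)^2} + m^2\psi. \qquad (3.27)$$


It is thus evident that the $\alpha$’s and $\beta$ cannot be ordinary, classical, commuting quantities.
Instead they must satisfy the following anti-commutation relations in order to eliminate the
unwanted terms on the right-hand side of equation (3.26):


$$\alpha_i\beta + \beta\alpha_i = 0 \qquad i = 1, 2, 3 \qquad (3.28)$$
$$\alpha_i\alpha_j + \alpha_j\alpha_i = 0 \qquad i, j = 1, 2, 3;\ i \neq j. \qquad (3.29)$$


In addition we require

$$\alpha_i^2 = \beta^2 = 1. \qquad (3.30)$$

Dirac proposed that the $\alpha$’s and $\beta$ should be interpreted as matrices, acting on a wavefunction
which had several components arranged as a column vector. Anticipating somewhat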
the results of the next section, we would expect that, since each such component obeys the 
same wave equation, the physical states which they represent would have the same energy. 
This would mean that the different components represent some degeneracy, associated with 
a new degree of freedom. 

The degree of freedom is, of course, spin: an entirely quantum mechanical angular momentum,
analogous to (but not equivalent to) orbital angular momentum. Consider, for
example, the wavefunctions for the 2p state in the simple non-relativistic theory of the
hydrogen atom. There are three of them, all degenerate with energy given by the $n = 2$
Bohr energy. The three corresponding states all have orbital angular momentum quantum
number $l$ equal to 1; they differ in their values of the ‘magnetic’ quantum number
$m$ (i.e. the eigenvalue of the $z$-component of the orbital angular momentum operator $\hat{L}_z$).
Specifically, these three wavefunctions have the form (omitting normalization constants)
$(r\sin\theta\,e^{i\phi},\ r\sin\theta\,e^{-i\phi},\ r\cos\theta)\,e^{-r/2r_B}$, where $r_B$ is the Bohr radius. Remembering the expressions
for the Cartesian coordinates $x$, $y$, and $z$ in terms of the spherical polar coordinates $r$,
$\theta$, and $\phi$, we see that by a suitable linear combination (always allowed for degenerate states)
we can write these wavefunctions as $(x, y, z)f(r)$, where again a normalization factor has
been omitted. In this form it is plain that the multiplicity of the p-state wavefunctions can 
be interpreted in simple geometrical terms: they are effectively the components of a vector 
(multiplication by the scalar function f(r) does not affect this). 

The several components of the Dirac wavefunction together make up a similar, but 
quite distinct, object called a spinor. We shall have more to say about this in chapter 4. 
For the moment we continue with the problem of finding the matrices a; and £ to satisfy 
(3.28)—(3.30) . 

As problem 3.2 shows, the smallest possible dimension of the matrices for which the 
Dirac conditions can be satisfied is 4 x 4. One conventional choice of the a’s and £ is 


a= (5 i a= (4 °) (3.31) 


where we have written these 4 x 4 matrices in 2 x 2 ‘block diagonal’ form, the o;’s are the 
2 x 2 Pauli matrices, 1 is the 2 x 2 unit matrix, and O is the 2 x 2 null matrix. The Pauli 
matrices (see appendix A) are defined by 


w= (5 T Aa m= (5 a (3.32) 


Readers unfamiliar with the labour-saving ‘block’ form of (3.31) should verify, both by using 
the corresponding explicit 4 x 4 matrices, such as

$$\alpha_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \qquad (3.33)$$


and so on, and by the block diagonal form, that this choice does indeed satisfy the required 
conditions. These are 


$$\{\alpha_i, \beta\} = 0 \qquad (3.34)$$
$$\{\alpha_i, \alpha_j\} = 2\delta_{ij}\,1 \qquad (3.35)$$
$$\beta^2 = 1 \qquad (3.36)$$
where {A,B} is the anti-commutator of two matrices, AB + BA, and 1 is here the 4 x 4
unit matrix. 
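
The check suggested above can also be carried out mechanically. The following is a minimal sketch, using only plain Python lists, that assembles the matrices of (3.31) from 2 x 2 blocks and tests the conditions (3.34)–(3.36). The helper names (`block`, `matmul`, `anticommutator`, `scaled_identity`, `satisfies_dirac_conditions`) are our own illustrative choices, not notation from the text.

```python
# Pauli matrices (3.32), the 2x2 unit matrix and the 2x2 null matrix.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ONE = [[1, 0], [0, 1]]
NEG_ONE = [[-1, 0], [0, -1]]
NULL = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks arranged as [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(A, B):
    """{A, B} = AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] + BA[i][j] for j in range(n)] for i in range(n)]


def scaled_identity(c, n=4):
    """c times the n x n unit matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


# Standard representation (3.31).
ALPHA = [block(NULL, s, s, NULL) for s in (SX, SY, SZ)]
BETA = block(ONE, NULL, NULL, NEG_ONE)


def satisfies_dirac_conditions(alpha, beta):
    """True if (3.34), (3.35) and (3.36) all hold for the given matrices."""
    ok = matmul(beta, beta) == scaled_identity(1)
    for i in range(3):
        ok = ok and anticommutator(alpha[i], beta) == scaled_identity(0)
        for j in range(3):
            expected = scaled_identity(2 if i == j else 0)
            ok = ok and anticommutator(alpha[i], alpha[j]) == expected
    return ok


print(satisfies_dirac_conditions(ALPHA, BETA))
```

Because every entry is 0, ±1 or ±i, the comparisons are exact and no numerical tolerance is needed. The first element of `ALPHA` reproduces the explicit matrix (3.33).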

At this point we can already begin to see that the extra multiplicity is very likely to have 
something to do with an angular momentum-like degree of freedom. In fact, if we define the 
spin matrices $\boldsymbol{S}$ by $\boldsymbol{S} = \tfrac{1}{2}\boldsymbol{\sigma}$ ($\hbar = 1$), we find from (3.32) that

$$[S_x, S_y] = iS_z \qquad (3.37)$$


(with obvious cyclic permutations), which are precisely the commutation relations satisfied
by the components $\hat{J}_x$, $\hat{J}_y$ and $\hat{J}_z$ of the angular momentum operator $\hat{\boldsymbol{J}}$ in quantum mechanics
(see appendix A). Furthermore, the eigenvalues of $S_z$ are $\pm\tfrac{1}{2}$, and of $\boldsymbol{S}^2$ are $s(s+1)$ with
$s = \tfrac{1}{2}$. So these matrices undoubtedly represent quantum mechanical angular momentum
operators, appropriate to a state with angular momentum quantum number $j = \tfrac{1}{2}$. This is
precisely what ‘spin’ is. We will discuss this in more detail in section 3.3.
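
These statements about the spin matrices are easy to confirm by direct multiplication. The sketch below is an illustration with hypothetical helper names (`matmul`, `commutator`, `times_i`); it checks the commutation relation (3.37) together with its cyclic permutations, and evaluates the sum of the squares of the three spin matrices.

```python
# Pauli matrices in the order x, y, z.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Spin matrices S = sigma / 2 (units with hbar = 1).
S = [[[0.5 * entry for entry in row] for row in s] for s in SIGMA]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]


def times_i(A):
    """Multiply every entry of A by the imaginary unit."""
    return [[1j * entry for entry in row] for row in A]


# [S_x, S_y] = i S_z and the two cyclic permutations.
cyclic_ok = all(
    commutator(S[a], S[(a + 1) % 3]) == times_i(S[(a + 2) % 3])
    for a in range(3)
)

# S^2 = S_x^2 + S_y^2 + S_z^2.
squares = [matmul(s, s) for s in S]
S_squared = [[sum(sq[i][j] for sq in squares) for j in range(2)]
             for i in range(2)]

print(cyclic_ok, S_squared)
```

The sum of squares comes out as 3/4 times the unit matrix, which is s(s+1) for s = 1/2, and the diagonal entries of the third matrix are its eigenvalues +1/2 and -1/2.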

It is important to note that the choice (3.31) of $\boldsymbol{\alpha}$ and $\beta$ is not unique. In fact, all matrices
related to these by any unitary 4 x 4 matrix U (which thus preserves the anti-commutation
relations) are allowed:

$$\alpha_i' = U \alpha_i U^{-1} \qquad (3.38)$$
$$\beta' = U \beta U^{-1}. \qquad (3.39)$$


Another commonly used representation is provided by the matrices 


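$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (3.40)$$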


The reader may check (problem 3.2) that these matrices also satisfy (3.34)–(3.36).

Unless otherwise stated, we shall use the standard representation (3.31). This is generally 
convenient for ‘low energy’ applications—that is, when the momentum |p| is significantly 
smaller than the mass m. In that case, $\beta m$ will be the largest term in the Dirac Hamiltonian
(see (3.23)), and it is sensible to have it in diagonal form. The choice (3.40), by contrast, is 
more natural when the mass is small compared with the energy or momentum. 
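
As a worked illustration of (3.38) and (3.39), one unitary matrix that connects the standard representation to one with diagonal $\boldsymbol{\alpha}$ and off-diagonal $\beta$ can be written in 2 x 2 block form as follows. The particular U is our own choice for illustration and is not given in the text; multiplying it by any overall phase gives the same result.

```latex
U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad U^{\dagger} = U, \qquad U^{2} = 1 \;\Rightarrow\; U^{-1} = U .

U \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix} U^{-1}
  = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
    \begin{pmatrix} \sigma_i & -\sigma_i \\ \sigma_i & \sigma_i \end{pmatrix}
  = \begin{pmatrix} \sigma_i & 0 \\ 0 & -\sigma_i \end{pmatrix},

U \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} U^{-1}
  = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
    \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
  = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
```

Since the anti-commutator of the transformed matrices is $U\{A, B\}U^{-1}$, and the right-hand sides of the Dirac conditions are multiples of the unit matrix, the conditions are automatically preserved.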


3.2.1 Free-particle solutions 


Since the Dirac Hamiltonian now involves 4 x 4 matrices, it is clear that we must interpret 
the Dirac wavefunction $\psi$ as a four-component column vector, the so-called Dirac spinor.
Let us look at the explicit form of the free-particle solutions. As in the KG case, we look
for solutions in which the space-time behaviour is of plane-wave form and put

$$\psi = \omega\, e^{-ip\cdot x} \qquad (3.41)$$


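where $\omega$ is a four-component spinor independent of x, and $e^{-ip\cdot x}$, with $p^\mu = (E, \boldsymbol{p})$, is the
plane-wave solution corresponding to 4-momentum $p^\mu$. We substitute this into the Dirac
equation

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + \beta m)\psi \qquad (3.42)$$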


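using the explicit $\boldsymbol{\alpha}$ and $\beta$ matrices. In order to use the 2 x 2 block form, it is conventional
(and convenient) to split the spinor $\omega$ into two two-component spinors $\phi$ and $\chi$:

$$\omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \qquad (3.43)$$

We obtain the matrix equation (see problem 3.3)

$$E \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m1 & \boldsymbol{\sigma}\cdot\boldsymbol{p} \\ \boldsymbol{\sigma}\cdot\boldsymbol{p} & -m1 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.44)$$

representing two coupled equations for $\phi$ and $\chi$:

$$(E - m)\phi = \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi \qquad (3.45)$$
and

$$(E + m)\chi = \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi. \qquad (3.46)$$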


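Solving for $\chi$ from (3.46), the general four-component spinor may be written (without
worrying about normalization for the moment)

$$\omega = \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\phi \end{pmatrix}. \qquad (3.47)$$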


What is the relation between E and p for this to be a solution of the Dirac equation? If we 
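substitute $\chi$ from (3.46) into (3.45) and remember that (problem 3.4)

$$(\boldsymbol{\sigma}\cdot\boldsymbol{p})^2 = \boldsymbol{p}^2\,1 \qquad (3.48)$$

we find that

$$(E - m)(E + m)\phi = \boldsymbol{p}^2\phi \qquad (3.49)$$

for any $\phi$. Hence we arrive at the same result as for the KG equation in that for a given
value of $\boldsymbol{p}$, two values of E are allowed:

$$E = \pm(\boldsymbol{p}^2 + m^2)^{1/2} \qquad (3.50)$$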


i.e. positive and negative solutions are still admitted. 
The Dirac equation does not therefore solve this problem. What about the probability 
current? 
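
The algebra leading from (3.44) to (3.50) can be checked numerically. The sketch below builds the spinor (3.47) for either sign of E and measures how far it is from satisfying the matrix equation (3.44). The function names and the sample values p = (2, 3, 6), m = 24 (chosen so that E = 25 exactly) are illustrative assumptions, not taken from the text.

```python
import math


def sigma_dot(p):
    """2x2 matrix sigma . p for a 3-vector p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]


def apply(M, v):
    """Matrix M acting on a column vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def free_spinor(p, m, phi, sign=+1):
    """Unnormalized spinor of (3.47) with E = sign * sqrt(p^2 + m^2)."""
    E = sign * math.sqrt(sum(c * c for c in p) + m * m)
    chi = [c / (E + m) for c in apply(sigma_dot(p), phi)]  # from (3.46)
    return E, list(phi) + chi


def dirac_hamiltonian_on(p, m, omega):
    """(alpha . p + beta m) omega in the standard representation (3.31)."""
    phi, chi = omega[:2], omega[2:]
    sp_phi = apply(sigma_dot(p), phi)
    sp_chi = apply(sigma_dot(p), chi)
    upper = [m * phi[i] + sp_chi[i] for i in range(2)]
    lower = [sp_phi[i] - m * chi[i] for i in range(2)]
    return upper + lower


def residual(p, m, phi, sign=+1):
    """Largest component of (alpha . p + beta m) omega - E omega."""
    E, omega = free_spinor(p, m, phi, sign)
    H_omega = dirac_hamiltonian_on(p, m, omega)
    return max(abs(H_omega[i] - E * omega[i]) for i in range(4))


for sign in (+1, -1):
    print(sign, residual((2, 3, 6), 24, [1, 0], sign))
```

Both residuals should vanish up to rounding error, showing that the same construction yields a solution for each sign of E. The division by E + m fails only in the special case E = -m, that is, for the negative root at zero momentum.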


3.2.2 Probability current for the Dirac equation 
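Consider the following quantity which we denote (suggestively) by $\rho$:

$$\rho = \psi^\dagger(x)\psi(x). \qquad (3.51)$$

Here $\psi^\dagger$ is the Hermitian conjugate row vector of the column vector $\psi$. In terms of components

$$\rho = (\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*) \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} \qquad (3.52)$$

so

$$\rho = \sum_{a=1}^{4} |\psi_a|^2 > 0 \qquad (3.53)$$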

and we see that $\rho$ is a scalar density which is explicitly positive-definite. This is one property
we require of a probability density. In addition, we require a conservation law, coming from
the Dirac equation, and a corresponding probability current density. In fact (see problem 3.5)
we can demonstrate, using the Dirac equation,

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + \beta m)\psi \qquad (3.54)$$
and its Hermitian conjugate
$$-i\,\partial\psi^\dagger/\partial t = \psi^\dagger(+i\boldsymbol{\alpha}\cdot\overleftarrow{\boldsymbol{\nabla}} + \beta m) \qquad (3.55)$$
that there is a conservation law of the required form
$$\partial\rho/\partial t + \boldsymbol{\nabla}\cdot\boldsymbol{j} = 0. \qquad (3.56)$$
The notation $\psi^\dagger\overleftarrow{\boldsymbol{\nabla}}$ requires some comment; it is shorthand for three row matrices
$\psi^\dagger\overleftarrow{\nabla}_x \equiv \partial\psi^\dagger/\partial x$ etc


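(recall that $\psi^\dagger$ is a row matrix).
In equation (3.56), with $\rho$ being given by (3.51), the probability current density $\boldsymbol{j}$ is

$$\boldsymbol{j}(x) = \psi^\dagger(x)\,\boldsymbol{\alpha}\,\psi(x) \qquad (3.57)$$
representing a 3-vector with components
$$(\psi^\dagger\alpha_1\psi,\ \psi^\dagger\alpha_2\psi,\ \psi^\dagger\alpha_3\psi). \qquad (3.58)$$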


We therefore have a positive-definite $\rho$ and an associated $\boldsymbol{j}$ satisfying the required conservation
law (3.56), which, as usual, we can write in invariant form as $\partial_\mu j^\mu = 0$, where

$$j^\mu = (\rho, \boldsymbol{j}). \qquad (3.59)$$


Thus $j^\mu$ is an acceptable probability current, unlike the current for the KG equation, as
we might have anticipated.

The form of equation (3.56) implies that $j^\mu$ of (3.59) is a contravariant 4-vector (cf
equation (D.4)), as we verified explicitly in the KG case. The corresponding verification
is more difficult in the Dirac case, since the Dirac spinor $\psi$ transforms non-trivially under
Lorentz transformations, unlike the KG wavefunction $\phi$. We shall come back to this problem
in chapter 4.

We now turn to further discussion of the spin degree of freedom, postponing considera- 
tion of the negative-energy solutions until section 3.4. 


3.3 Spin 


Four-momentum is not the only physical property of a particle obeying the Dirac equation. 
We must now interpret the column vector (Dirac spinor) part, $\omega$, of the solution (3.41). The
particular properties of the $\sigma$-matrices, appearing in the $\alpha$-matrices, have already led us to
think in terms of spin. A further indication that this is correct comes when we consider the
explicit form of $\omega$ given in (3.47). In this equation, the two-component spinor $\phi$ is completely
arbitrary. It may be chosen in just two linearly independent ways, for example

$$\phi_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (3.60)$$

which (as the notation of course indicates) are in fact eigenvectors of $S_z = \tfrac{1}{2}\sigma_z$ with eigenvalues
$\pm\tfrac{1}{2}$ (‘up’ and ‘down’ along the z-axis). Remember that, in quantum mechanics, linear
combinations of wavefunctions can be formed using complex numbers as superposition
coefficients, in general; so the most general $\phi$ can always be written as

$$\phi = \begin{pmatrix} a \\ b \end{pmatrix} = a\phi_\uparrow + b\phi_\downarrow \qquad (3.61)$$


where a and b are complex numbers. Hence, there are precisely two linearly independent 
solutions, for a given 4-momentum, just as we would expect for a quantum system with 
$j = \tfrac{1}{2}$ (the multiplicity is $2j + 1$, in general).

In the rest frame of the particle (p = 0) this interpretation is straightforward. In this 
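case choosing (3.60) for the two independent $\phi$'s, the solutions (3.61) for E = m reduce to

$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} e^{-imt} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} e^{-imt}. \qquad (3.62)$$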


Since we have degeneracy between these two solutions (both have E = m), there must
be some operator which commutes with the energy operator, and whose eigenvalues would
distinguish the solutions (3.62). In this case the energy operator is just $\beta m$ (from (3.54)
setting $-i\boldsymbol{\nabla}$ to zero, since $\boldsymbol{p} = 0$) and the required operator commuting with $\beta$ is

$$\Sigma_z = \begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} \qquad (3.63)$$


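which has eigenvalues 1 (twice) and −1 (twice). Our rest-frame spinors appearing in (3.62)
are indeed eigenstates of $\Sigma_z$, with eigenvalues ±1 as can be easily verified.
Generalizing (3.63), we introduce the three matrices $\boldsymbol{\Sigma}$ where

$$\boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}. \qquad (3.64)$$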


Then the operators $\tfrac{1}{2}\boldsymbol{\Sigma}$ are such that

$$[\tfrac{1}{2}\Sigma_x, \tfrac{1}{2}\Sigma_y] = i\,\tfrac{1}{2}\Sigma_z \qquad (3.65)$$

and $(\tfrac{1}{2}\boldsymbol{\Sigma})^2 = \tfrac{3}{4}I$ where I is now the unit 4 x 4 matrix. These are just the properties
expected of quantum-mechanical angular momentum operators (see appendix A) belonging
to magnitude $j = \tfrac{1}{2}$ (we already know that the eigenvalues of $\tfrac{1}{2}\Sigma_z$ are $\pm\tfrac{1}{2}$). So we can
interpret $\tfrac{1}{2}\boldsymbol{\Sigma}$ as spin-½ operators appropriate to our rest-frame solutions; and, at least in
the rest frame, we may say that the Dirac equation describes a particle of spin-½.

It seems reasonable to suppose that the magnitude of a spin of a particle could not be
changed by doing a Lorentz transformation, as would be required in order to discuss the
spin in a general frame with $\boldsymbol{p} \neq 0$. But $\tfrac{1}{2}\boldsymbol{\Sigma}$ is then no longer a suitable spin operator,
since it fails to commute with the energy operator, which is now $(\boldsymbol{\alpha}\cdot\boldsymbol{p} + \beta m)$ from (3.54),
for a plane-wave solution with momentum $\boldsymbol{p}$. Yet there are still just two independent states
for a given 4-momentum as our explicit solution (3.47) shows: $\phi$ can still be chosen in only
two linearly independent ways. Hence there must be some operator which does commute
with $\boldsymbol{\alpha}\cdot\boldsymbol{p} + \beta m$, and whose eigenvalues can be used to distinguish the two states. Actually
this condition is not enough to specify such an operator uniquely, and several choices are
common. One of the most useful is the helicity operator $h(\boldsymbol{p})$ defined by

$$h(\boldsymbol{p}) = \begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|} \end{pmatrix} \qquad (3.66)$$


which (see problem 3.6) does commute with $\boldsymbol{\alpha}\cdot\boldsymbol{p} + \beta m$. We can therefore choose our general
$\boldsymbol{p} \neq 0$ states to be eigenstates of $h(\boldsymbol{p})$. These will be called ‘helicity states’; physically they
are eigenstates of $\boldsymbol{\Sigma}$ resolved along the direction of $\boldsymbol{p}$.

By using (3.48), it is easy to see that the eigenvalues of h(p) are +1 (twice) and —1 
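(twice). Our general four-component spinor (3.47) is therefore an eigenstate of $h(\boldsymbol{p})$ if

$$\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|} \end{pmatrix} \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\phi \end{pmatrix} = \pm \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\phi \end{pmatrix}. \qquad (3.67)$$

Taking the + sign first, this will hold if

$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|}\,\phi_+ = \phi_+ \qquad (3.68)$$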


where the + subscript has been added to indicate that this $\phi$ is a solution of (3.68). Such
a $\phi_+$ is called a two-component helicity spinor. The explicit form of $\phi_+$ can be found by
solving (3.68); see problem 3.7. Similarly, the four-component spinor will be an eigenstate
of $h(\boldsymbol{p})$ belonging to the eigenvalue −1 if it contains $\phi_-$ where

$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|}\,\phi_- = -\phi_-. \qquad (3.69)$$

Again, these two choices $\phi_+$ and $\phi_-$ are linearly independent.
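
For a momentum pointing along the direction with polar angles (θ, φ), one explicit pair of two-component helicity spinors can be written down and tested against (3.68) and (3.69). The sketch below uses one common phase convention, which is an assumption on our part (the overall phase of each spinor is arbitrary); the function names are hypothetical.

```python
import cmath
import math


def sigma_dot_unit(theta, phi):
    """2x2 matrix sigma . u for u = (sin t cos f, sin t sin f, cos t)."""
    st, ct = math.sin(theta), math.cos(theta)
    return [[ct, st * cmath.exp(-1j * phi)],
            [st * cmath.exp(1j * phi), -ct]]


def helicity_spinors(theta, phi):
    """Normalized phi_+ and phi_- in an assumed phase convention."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    plus = [c, cmath.exp(1j * phi) * s]
    minus = [-cmath.exp(-1j * phi) * s, c]
    return plus, minus


def apply(M, v):
    """2x2 matrix acting on a two-component spinor."""
    return [M[i][0] * v[0] + M[i][1] * v[1] for i in range(2)]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def helicity_residuals(theta, phi):
    """Largest deviations from (3.68) and from (3.69), respectively."""
    M = sigma_dot_unit(theta, phi)
    plus, minus = helicity_spinors(theta, phi)
    r_plus = max(abs(a - b) for a, b in zip(apply(M, plus), plus))
    r_minus = max(abs(a + b) for a, b in zip(apply(M, minus), minus))
    return r_plus, r_minus


print(helicity_residuals(1.1, 0.7))
```

Both residuals should be zero to rounding error for any choice of angles, and the two spinors are orthogonal, which makes their linear independence explicit.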


3.4 The negative-energy solutions 


In this section we shall first look more closely at the form of both the positive- and negative- 
energy solutions of the Dirac equation, and we shall then concentrate on the physical inter- 
pretation of the negative-energy solutions of both the Dirac and the KG equations. 

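It will be convenient, from now on, to reserve the symbol ‘E’ for the positive square root
in (3.50): $E = +(\boldsymbol{p}^2 + m^2)^{1/2}$. The general 4-momentum in the plane-wave solution (3.41) will
be denoted by $p^\mu = (p^0, \boldsymbol{p})$ where $p^0$ may be either positive or negative. With this notation
equation (3.44) becomes

$$p^0 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m1 & \boldsymbol{\sigma}\cdot\boldsymbol{p} \\ \boldsymbol{\sigma}\cdot\boldsymbol{p} & -m1 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.70)$$

in our original representation for $\boldsymbol{\alpha}$ and $\beta$.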


3.4.1 Positive-energy spinors 


Positive-energy spinors are such that 


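$$p^0 = +(\boldsymbol{p}^2 + m^2)^{1/2} \equiv E > 0. \qquad (3.71)$$

We eliminate $\chi$ and obtain positive-energy spinors in the form

$$\omega^{1,2} = N \begin{pmatrix} \phi^{1,2} \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\phi^{1,2} \end{pmatrix} \qquad (3.72)$$

with $\phi^{1\dagger}\phi^1 = \phi^{2\dagger}\phi^2 = 1$. We shall now choose N so that for these positive-energy solutions
$\omega^\dagger\omega = 2E$. In this case the spinors will be denoted by $u(p, s)$, where (problem 3.8)

$$u(p, s) = (E+m)^{1/2} \begin{pmatrix} \phi^s \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\phi^s \end{pmatrix}, \qquad s = 1, 2 \qquad (3.73)$$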


and s labels the spin degree of freedom in some suitable way (e.g. the helicity eigenvalues). 
The complete plane-wave solution $\psi$ for such a positive 4-momentum state is then

$$\psi = u(p, s)\,e^{-ip\cdot x} \qquad (3.74)$$

with $p^\mu = (E, \boldsymbol{p})$.
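
As an illustration of the normalization in (3.73), the following sketch constructs u(p, s) for a given momentum and a normalized two-component spinor, then evaluates the density (3.51) and the current (3.57); the plane-wave phase in (3.74) cancels in both. The function names and the sample values p = (2, 3, 6), m = 24 (for which E = 25) are assumptions made for the example.

```python
import math

# Pauli matrices in the order x, y, z.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def apply(M, v):
    """2x2 matrix acting on a two-component spinor."""
    return [M[i][0] * v[0] + M[i][1] * v[1] for i in range(2)]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def u_spinor(p, m, phi):
    """u(p, s) of (3.73) for a normalized two-component spinor phi."""
    E = math.sqrt(sum(c * c for c in p) + m * m)
    sigma_p_phi = [sum(p[k] * apply(PAULI[k], phi)[i] for k in range(3))
                   for i in range(2)]
    norm = math.sqrt(E + m)
    upper = [norm * c for c in phi]
    lower = [norm * c / (E + m) for c in sigma_p_phi]
    return E, upper + lower


def density_and_current(u):
    """rho = u^dagger u and j_k = u^dagger alpha_k u, standard representation."""
    upper, lower = u[:2], u[2:]
    rho = inner(u, u).real
    # alpha_k is off-diagonal in block form, so it couples upper and lower parts.
    j = [(inner(upper, apply(PAULI[k], lower))
          + inner(lower, apply(PAULI[k], upper))).real
         for k in range(3)]
    return rho, j


E, u = u_spinor((2, 3, 6), 24, [0.6, 0.8j])
print(E, density_and_current(u))
```

With this normalization the density comes out as 2E and the current as 2p, so that the four components together are proportional to the 4-momentum of the state.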


3.4.2 Negative-energy spinors 


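Now we look for spinors appropriate to the solution

$$p^0 = -(\boldsymbol{p}^2 + m^2)^{1/2} = -E < 0 \qquad (3.75)$$

(E is always defined to be positive). Consider first what are appropriate solutions at rest.
We have now

$$p^0 = -m, \qquad \boldsymbol{p} = 0 \qquad (3.76)$$
and
$$-m \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m1 & 0 \\ 0 & -m1 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.77)$$
leading to
$$\phi = 0. \qquad (3.78)$$
Thus the two independent negative-energy solutions at rest are just
$$\omega(p^0 = -m, s) = \begin{pmatrix} 0 \\ \chi^s \end{pmatrix}. \qquad (3.79)$$
The solution for finite momentum $+\boldsymbol{p}$, i.e. for 4-momentum $(-E, \boldsymbol{p})$, is then
$$\omega(p^0 = -E, \boldsymbol{p}, s) = \begin{pmatrix} \dfrac{-\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\chi^s \\ \chi^s \end{pmatrix} \qquad (3.80)$$

with $\chi^{s\dagger}\chi^s = 1$. However, it is clearly much more in keeping with relativity if, in addition to
changing the sign of E, we also change the sign of $\boldsymbol{p}$ and consider solutions corresponding
to negative 4-momentum $(-E, -\boldsymbol{p}) = -p^\mu$. We therefore define
$$\omega(p^0 = -E, -\boldsymbol{p}, s) \equiv \omega^{3,4} = N \begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\chi^{1,2} \\ \chi^{1,2} \end{pmatrix}. \qquad (3.81)$$

[Figure 3.1: level diagram showing the positive-energy continuum E > m and the negative-energy continuum E < −m.]

FIGURE 3.1
Energy levels for Dirac particle.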


Adopting the same N as in (3.73) implies the same normalization ($\omega^\dagger\omega = 2E$) for (3.81) as
in (3.73); in this case the spinors are called $v(p, s)$ where (problem 3.8)

$$v(p, s) = (E+m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E+m}\,\chi^s \\ \chi^s \end{pmatrix}, \qquad s = 1, 2. \qquad (3.82)$$

(There is a small subtlety in the choice of $\chi^1$ and $\chi^2$ which we will come to shortly.) The
solution $\psi$ for such negative 4-momentum states is then

$$\psi = v(p, s)\,e^{-i(-p)\cdot x} = v(p, s)\,e^{ip\cdot x}. \qquad (3.83)$$
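
A short numerical check may help to fix the signs here. For the plane wave (3.83) the operator $-i\boldsymbol{\nabla}$ gives $-\boldsymbol{p}$ and $i\,\partial/\partial t$ gives $-E$, so the spinor v(p, s) should satisfy $(-\boldsymbol{\alpha}\cdot\boldsymbol{p} + \beta m)v = -Ev$. The sketch below tests this, together with the normalization, for sample values that are our own choice (p = (2, 3, 6), m = 24, so E = 25); the function names are hypothetical.

```python
import math


def sigma_dot(p):
    """2x2 matrix sigma . p for a 3-vector p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]


def apply(M, v):
    """2x2 matrix acting on a two-component spinor."""
    return [M[i][0] * v[0] + M[i][1] * v[1] for i in range(2)]


def v_spinor(p, m, chi):
    """v(p, s) of (3.82) for a normalized two-component spinor chi."""
    E = math.sqrt(sum(c * c for c in p) + m * m)
    norm = math.sqrt(E + m)
    upper = [norm * c / (E + m) for c in apply(sigma_dot(p), chi)]
    lower = [norm * c for c in chi]
    return E, upper + lower


def hamiltonian_on(p, m, w):
    """(alpha . p + beta m) w in the standard representation (3.31)."""
    top, bottom = w[:2], w[2:]
    sp_top = apply(sigma_dot(p), top)
    sp_bottom = apply(sigma_dot(p), bottom)
    return ([m * top[i] + sp_bottom[i] for i in range(2)]
            + [sp_top[i] - m * bottom[i] for i in range(2)])


def v_checks(p, m, chi):
    """Return E, the residual of (-alpha.p + beta m) v = -E v, and v^dagger v."""
    E, v = v_spinor(p, m, chi)
    minus_p = [-c for c in p]
    Hv = hamiltonian_on(minus_p, m, v)
    residual = max(abs(Hv[i] + E * v[i]) for i in range(4))
    norm = sum(abs(c) ** 2 for c in v)
    return E, residual, norm


print(v_checks((2, 3, 6), 24, [0, 1]))
```

The residual should vanish to rounding error and the norm should equal 2E, the same normalization as for the positive-energy spinors.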


3.4.3 Dirac’s interpretation of the negative-energy solutions of the 
Dirac equation 


The physical interpretation of the positive-energy solution (3.74) is straightforward, in terms
of the $\rho$ and $\boldsymbol{j}$ given in section 3.2.2. They describe spin-½ particles with 4-momentum $(E, \boldsymbol{p})$
and spin appropriate to the choice of $\phi^s$; $\rho$ and the energy $p^0$ are both positive.

Unfortunately $\rho$ is also positive for the negative-energy solutions (3.83), so we cannot
eliminate them on that account. This means that for a free Dirac particle (e.g. an electron) 
the available positive- and negative-energy levels are as shown in figure 3.1. This, in turn, 
implies that a particle with initially positive energy can ‘cascade down’ through the negative- 
energy levels, without limit; in this case no stable positive-energy state would exist! 

In order to prevent positive-energy electrons making transitions to the lower, negative- 
energy states, Dirac postulated that the normal ‘empty’, or ‘vacuum’, state—that with 
no positive-energy electrons present—is such that all the negative-energy states are filled 
with electrons. The Pauli exclusion principle then forbids any positive-energy electrons from 
falling into these lower energy levels. The ‘vacuum’ now has infinite negative charge and 
energy, but since all observations represent finite fluctuations in energy and charge with 
respect to this vacuum, this leads to an acceptable theory. For example, if one negative- 
energy electron is absent from the Dirac sea, we have a ‘hole’ relative to the normal vacuum: 


energy of ‘hole’ = $-(E_{\rm neg})$ → positive energy
charge of ‘hole’ = $-(q_e)$ → positive charge.

Thus the absence of a negative-energy electron is equivalent to the presence of a positive-
energy positively charged version of the electron, that is a positron. In the same way, the
absence of a ‘spin-up’ negative-energy electron is equivalent to the presence of a ‘spin-down’
positive-energy positron. This last point is the reason for the subtlety in the choice of $\chi^s$
mentioned after (3.82) and we choose

$$\chi^1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \chi^2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad (3.84)$$


the opposite way round from the choice for the positive-energy spinors (3.73). 

Dirac’s brilliant re-interpretation of (unfilled) negative-energy solutions in terms of anti-
particles is one of the triumphs of theoretical physics (see footnote 2). Carl Anderson received the Nobel
Prize for his discovery of the positron in 1932 (Anderson 1932). 

In this way it proved possible to obtain sensible results from the Dirac equation and its 
negative-energy solutions. It is clear, however, that the theory is no longer really a ‘single- 
particle’ theory, since we can excite electrons from the infinite ‘sea’ of filled negative-energy 
states that constitute the normal ‘empty state’. For example, if we excite one negative-energy 
electron to a positive-energy state, we have in the final state a positive-energy electron plus a 
positive-energy positron ‘hole’ in the vacuum. This corresponds physically to the process of 
$e^+e^-$ pair creation. Thus this way of dealing with the negative-energy problem for fermions
leads us directly to the need for a quantum field theory. The appropriate formalism will be 
presented later, in section 7.2. 


3.4.4 Feynman’s interpretation of the negative-energy solutions of the 
KG and Dirac equations 


It is clear that despite its brilliant success for spin-½ particles, Dirac’s interpretation cannot
be applied to spin-0 particles, since bosons are not subject to the exclusion principle. Besides,
spin-0 particles also have their corresponding anti-particles (e.g. $\pi^+$ and $\pi^-$), and so do
spin-1 particles ($W^+$ and $W^-$, for instance). A consistent picture for both bosons and fermions
does emerge from quantum field theory, as we shall see in chapters 5-7, which is perhaps one 
of the strongest reasons for mastering it. Nevertheless, it is useful to have an alternative, 
non-field-theoretic, interpretation of the negative-energy solutions which works for both 
bosons and fermions. Such an interpretation is due to Feynman. In essence, the idea is that 
the negative 4-momentum solutions will be used to describe anti-particles, for both bosons 
and fermions. 

We begin with bosons—for example pions, which for the present purposes we take to be 
simple spin-0 particles whose wavefunctions obey the KG equation. We decide by convention 
that the $\pi^+$ is the ‘particle’. We will then have

positive 4-momentum $\pi^+$ solutions: $N e^{-ip\cdot x}$ (3.85)
negative 4-momentum $\pi^+$ solutions: $N e^{ip\cdot x}$ (3.86)
where $p^\mu = [(m^2 + \boldsymbol{p}^2)^{1/2}, \boldsymbol{p}]$. The electromagnetic current for a free physical
(positive-energy) $\pi^+$ is given by the probability current for a positive-energy solution multiplied by
the charge $Q\,(= +e)$:

$$j^\mu_{\rm em}(\pi^+) = (+e) \times (\text{probability current for positive-energy } \pi^+) \qquad (3.87)$$
$$\phantom{j^\mu_{\rm em}(\pi^+)} = (+e)\,2|N|^2\,[(m^2 + \boldsymbol{p}^2)^{1/2}, \boldsymbol{p}] \qquad (3.88)$$


2 At that time, this was not universally recognized. For example, Pauli (1933) wrote: ‘Dirac has tried to
identify holes with anti-electrons...we do not believe that this explanation can be seriously considered.’

[Figure 3.2: two panels, (a) and (b).]

FIGURE 3.2
Coulomb scattering of a $\pi^-$ by a static charge Ze illustrating the Feynman interpretation
of negative 4-momentum states.


using (3.20) and (3.85) (see problem 3.1). What about the current for the $\pi^-$? For free
physical $\pi^-$ particles of positive energy $(m^2 + \boldsymbol{p}^2)^{1/2}$ and momentum $\boldsymbol{p}$, we expect

$$j^\mu_{\rm em}(\pi^-) = (-e)\,2|N|^2\,[(m^2 + \boldsymbol{p}^2)^{1/2}, \boldsymbol{p}] \qquad (3.89)$$

by simply changing the sign of the charge in (3.88). But it is evident that (3.89) may be
written as
$$j^\mu_{\rm em}(\pi^-) = (+e)\,2|N|^2\,[-(m^2 + \boldsymbol{p}^2)^{1/2}, -\boldsymbol{p}] \qquad (3.90)$$

which is just $j^\mu_{\rm em}(\pi^+)$ with negative 4-momentum. This suggests some equivalence between
anti-particle solutions with positive 4-momentum and particle solutions with negative 4- 
momentum. 

Can we push this equivalence further? Consider what happens when a system A absorbs
a $\pi^+$ with positive 4-momentum p; its charge increases by +e and its 4-momentum increases
by p. Now suppose that A emits a physical $\pi^-$ with 4-momentum k, where the energy $k^0$
is positive. Then the charge of A will increase by +e, and its 4-momentum will decrease
by k. Now this increase in the charge of A could equally well be caused by the absorption
of a $\pi^+$, and indeed we can make the effect (as far as A is concerned) of the $\pi^-$ emission
process fully equivalent to a $\pi^+$ absorption process if we say that the equivalent absorbed
$\pi^+$ has negative 4-momentum, −k; in particular the equivalent absorbed $\pi^+$ has negative
energy $-k^0$. In this way, we view the emission of a physical ‘anti-particle’ $\pi^-$ with positive
4-momentum k as equivalent to the absorption of a ‘particle’ $\pi^+$ with (unphysical) negative
4-momentum −k. Similar reasoning will apply to the absorption of a $\pi^-$ of positive
4-momentum, which is equivalent to the emission of a $\pi^+$ of negative 4-momentum. Thus we
are led to the following hypothesis (due to Feynman):


The emission (absorption) of an anti-particle of 4-momentum $p^\mu$ is physically equivalent
to the absorption (emission) of a particle of 4-momentum $-p^\mu$.


In other words the unphysical negative 4-momentum solutions of the ‘particle’ equation 
do have a role to play: they can be used to describe physical processes involving positive 
4-momentum anti-particles, if we reverse the role of ‘entry’ and ‘exit’ states. 

The idea is illustrated in figure 3.2, for the case of Coulomb scattering of a $\pi^-$ particle
by a static charge Ze, which will be discussed later in section 8.1.3. By convention we are
taking $\pi^-$ to be the anti-particle. In the physical process of figure 3.2(a) the incoming
physical anti-particle $\pi^-$ has 4-momentum $p_i$, and the final $\pi^-$ has 4-momentum $p_f$; both
$E_i$ and $E_f$ are, of course, positive. Figure 3.2(b) shows how the amplitude for the process
can be calculated using $\pi^+$ solutions with negative 4-momentum. The initial state $\pi^-$ of
4-momentum $p_i$ becomes a final state $\pi^+$ with 4-momentum $-p_i$, and similarly the final
state $\pi^-$ of 4-momentum $p_f$ becomes an initial state $\pi^+$ of 4-momentum $-p_f$. Note that in
this and similar figures, the sense of the arrows always indicates the ‘flow’ of 4-momentum,
positive 4-momentum corresponding to forward flow.

It is clear that the basic physical idea here is not limited to bosons. But there is a 
difference between the KG and Dirac cases in that the Dirac equation was explicitly designed 
to yield a probability density (and probability current density) which was independent of 
the sign of the energy: 


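$$\rho = \psi^\dagger\psi, \qquad \boldsymbol{j} = \psi^\dagger\boldsymbol{\alpha}\psi. \qquad (3.91)$$
Thus for any solutions of the form
$$\psi = \omega\,\phi(\boldsymbol{x}, t) \qquad (3.92)$$
we have
$$\rho = \omega^\dagger\omega\,|\phi(\boldsymbol{x}, t)|^2 \qquad (3.93)$$
and
$$\boldsymbol{j} = \omega^\dagger\boldsymbol{\alpha}\,\omega\,|\phi(\boldsymbol{x}, t)|^2 \qquad (3.94)$$

and $\rho \geq 0$ always. We nevertheless want to set up a correspondence so that positive-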
energy solutions describe electrons (taken to be the ‘particle’, by convention, in this case) 
and negative-energy solutions describe positrons, if we reverse the sense of incoming and 
outgoing waves. For the KG case this was straightforward, since the probability current was 
proportional to the 4-momentum: 

$$j^\mu({\rm KG}) \sim p^\mu. \qquad (3.95)$$
We were therefore able to set up the correspondence for the electromagnetic current of $\pi^+$
and $\pi^-$:

$\pi^+$: $j^\mu_{\rm em} \sim e\,p^\mu$, positive energy $\pi^+$ (3.96)
$\pi^-$: $j^\mu_{\rm em} \sim (-e)\,p^\mu$, positive energy $\pi^-$ (3.97)
$\phantom{\pi^-: j^\mu_{\rm em}} = (+e)(-p^\mu)$, negative energy $\pi^+$. (3.98)

This simple connection does not hold for the Dirac case since $\rho \geq 0$ for both signs of
the energy. It is still possible to set up the correspondence, but now an extra minus sign 
must be inserted ‘by hand’ whenever we have a negative-energy fermion in the final state. 
We shall make use of this rule in section 8.2.4. We therefore state the Feynman hypothesis 
for fermions: 


The invariant amplitude for the emission (absorption) of an anti-fermion of
4-momentum $p^\mu$ and spin projection $s_z$ in the rest frame is equal to the amplitude (minus
the amplitude) for the absorption (emission) of a fermion of 4-momentum $-p^\mu$ and spin
projection $-s_z$ in the rest frame.


As we shall see in chapters 5-7, the Feynman interpretation of the negative-energy 
solutions is naturally embodied in the field theory formalism. 




3.5 Inclusion of electromagnetic interactions via the gauge 
principle: the Dirac prediction of g = 2 for the electron 


Having set up the relativistic spin-0 and spin-½ free-particle wave equations, we are now in
a position to use the machinery developed in chapter 2, in order to include electromagnetic
interactions. All we have to do is make the replacement
$$\partial^\mu \to D^\mu \equiv \partial^\mu + iqA^\mu \qquad (3.99)$$

for a particle of charge q. For the spin-0 KG equation (3.10) we obtain, after some
rearrangement (problem 3.9),

$$(\Box + m^2)\phi = -iq(\partial_\mu A^\mu + A^\mu\partial_\mu)\phi + q^2A^2\phi \qquad (3.100)$$
$$\phantom{(\Box + m^2)\phi} \equiv -\hat{V}_{\rm KG}\,\phi. \qquad (3.101)$$


Note that the potential $\hat{V}_{\rm KG}$ contains the differential operator $\partial_\mu$; the sign of $\hat{V}_{\rm KG}$ is a
convention chosen so as to maintain the same relative sign between $\boldsymbol{\nabla}^2$ and $\hat{V}$ as in the
Schrödinger equation, for example that in (A.5).

For the Dirac equation, the replacement (3.99) leads to 


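$$i\frac{\partial\psi}{\partial t} = [\boldsymbol{\alpha}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A}) + \beta m + qA^0]\psi \qquad (3.102)$$

where $A^\mu = (A^0, \boldsymbol{A})$. The potential due to $A^\mu$ is therefore $\hat{V}_{\rm D} = qA^0 1 - q\boldsymbol{\alpha}\cdot\boldsymbol{A}$, which is a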
4 x 4 matrix acting on the Dirac spinor. 

The non-relativistic limit of (3.102) is of great importance, both physically and histor- 
ically. It was, of course, first obtained by Dirac; and it provided, in 1928, a sensational 
explanation of why the g-factor of the electron had the value g = 2, which was then the 
empirical value, without any theoretical basis. 

By way of background, recall from appendix A that the Schrödinger equation for a non- 
relativistic spinless particle of charge q in a magnetic field B described by a vector potential 
A such that B = V x A is 

$$-\frac{1}{2m}\boldsymbol{\nabla}^2\psi - \frac{q}{2m}\boldsymbol{B}\cdot\hat{\boldsymbol{L}}\,\psi + \frac{q^2}{2m}\boldsymbol{A}^2\psi = i\frac{\partial\psi}{\partial t}. \qquad (3.103)$$
Taking $\boldsymbol{B}$ along the z-axis, the $\boldsymbol{B}\cdot\hat{\boldsymbol{L}}$ term will cause the usual splitting (into states of
different magnetic quantum number) of the $(2l + 1)$-fold degeneracy associated with a state
of definite $l$. In particular, though, there should be no splitting of the hydrogen ground
state which has $l = 0$. But experimentally splitting into two levels is observed, indicating a
two-fold degeneracy and thus (see earlier) a $j = \tfrac{1}{2}$-like degree of freedom.

Uhlenbeck and Goudsmit (1925) suggested that the doubling of the hydrogen ground 
state could be explained if the electron were given an additional quantum number
corresponding to an angular-momentum-like observable, having magnitude $j = \tfrac{1}{2}$. The operators
$\boldsymbol{S} = \tfrac{1}{2}\boldsymbol{\sigma}$ which we have already met serve to represent such a spin angular momentum.
If the contribution to the energy operator of the particle due to its spin $\boldsymbol{S}$ enters into the
effective Schrödinger equation in exactly the same way as that due to its orbital angular
momentum, then we would expect an additional term on the left-hand side of (3.103) of the
form

$$-\frac{q}{2m}\boldsymbol{B}\cdot\boldsymbol{S}. \qquad (3.104)$$
The corresponding wavefunction must now have two (spinor) components, acted on by the 
2 x 2 matrices in S. 

The energy difference between the two levels with eigenvalues $S_z = \pm\tfrac{1}{2}$ would then be
$qB/2m$ in magnitude. Experimentally the splitting was found to be just twice this value.
Thus empirically the term (3.104) was modified to

$$-g\frac{q}{2m}\boldsymbol{B}\cdot\boldsymbol{S} \qquad (3.105)$$

where g is the ‘gyromagnetic ratio’ of the particle, with $g \approx 2$. Let us now see how Dirac
deduced the term (3.105), with the precise value g = 2, from his equation.

To achieve a non-relativistic limit, we expect that we have somehow to reduce the four-
component Dirac equation to one involving just two components, since the desired term
(3.105) is only a 2 x 2 matrix. Looking at the explicit form (3.72) for the free-particle
positive-energy solutions, we see that the lower two components are of order v (i.e. v/c with
c = 1) times the upper two. This suggests that, to get a non-relativistic limit, we should
regard the lower two components of $\psi$ as being small (at least in the specific representation
we are using for $\boldsymbol{\alpha}$ and $\beta$). However, since (3.102) includes the $A^\mu$-field, this will have to
be demonstrated (see (3.112)). Also, if we write the total energy operator as $m + \hat{H}_1$, we
expect $\hat{H}_1$ to be the non-relativistic energy operator.


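We let
$$\psi = \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} \qquad (3.106)$$

where $\Psi$ and $\Phi$ are not free-particle solutions, and they carry the space-time dependence
as well as the spinor character (each has two components). We set

$$\hat{H}_1 = \boldsymbol{\alpha}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A}) + \beta m + qA^0 - m \qquad (3.107)$$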


where a 4 x 4 unit matrix multiplying the last two terms is understood. Then 


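$$\hat{H}_1 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} = \begin{pmatrix} 0 & \boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A}) \\ \boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A}) & 0 \end{pmatrix} \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} - 2m \begin{pmatrix} 0 \\ \Phi \end{pmatrix} + qA^0 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix}. \qquad (3.108)$$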


Multiplying out (3.108), we obtain 


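$$\hat{H}_1\Psi = \boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\Phi + qA^0\Psi \qquad (3.109)$$
$$\hat{H}_1\Phi = \boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\Psi + qA^0\Phi - 2m\Phi. \qquad (3.110)$$

From (3.110), we obtain
$$(\hat{H}_1 - qA^0 + 2m)\Phi = \boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\Psi. \qquad (3.111)$$

So, if $\hat{H}_1$ (or rather any matrix element of it) is $\ll m$ and if $A^0$ is positive or, if negative,
much less in magnitude than m/e, we can deduce

$$\Phi \sim (\text{velocity}) \times \Psi \qquad (3.112)$$

as in the free case, provided that the magnetic energy $\sim \boldsymbol{\sigma}\cdot\boldsymbol{A}$ is not of order m. Further, if
$\hat{H}_1 \ll m$ and the conditions on the fields are met, we can drop $\hat{H}_1$ and $qA^0$ on the left-hand
side of (3.111), as a first approximation, so that

$$\Phi \approx \frac{\boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})}{2m}\,\Psi. \qquad (3.113)$$
Hence, in (3.109),
$$\hat{H}_1\Psi \approx \frac{1}{2m}\{\boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\}^2\Psi + qA^0\Psi. \qquad (3.114)$$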


The right-hand side of (3.114) should therefore be the non-relativistic energy operator for
a spin-½ particle of charge q and mass m in a field $A^\mu$.

Consider then the case $A^0 = 0$ which is sufficient for the discussion of g. We need to
evaluate
$$\{\boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\}^2\Psi. \qquad (3.115)$$

This requires care, because although it is true that (for example) $(\boldsymbol{\sigma}\cdot\boldsymbol{p})^2 = \boldsymbol{p}^2$ if $\boldsymbol{p} = (p_x, p_y, p_z)$
are ordinary numbers which commute with each other, the components of ‘$-i\boldsymbol{\nabla} - q\boldsymbol{A}$’
do not commute due to the presence of the differential operator $\boldsymbol{\nabla}$, and the fact that
$\boldsymbol{A}$ depends on $\boldsymbol{r}$. In problem 3.10 it is shown that

$$\{\boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\}^2\Psi = (-i\boldsymbol{\nabla} - q\boldsymbol{A})^2\Psi - q\,\boldsymbol{\sigma}\cdot\boldsymbol{B}\,\Psi. \qquad (3.116)$$


The first term on the right-hand side of (3.116) when inserted into (3.114), gives precisely
the spin-0 non-relativistic Hamiltonian appearing on the left-hand side of (3.103) (see
appendix A), while the second term in (3.116) yields exactly (3.105) with g = 2, recalling that
$\boldsymbol{S} = \tfrac{1}{2}\boldsymbol{\sigma}$. Thus the non-relativistic reduction of the Dirac equation leads to the prediction
g = 2 for a spin-½ particle.
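
The identification of g in the last step can be written out explicitly. The lines below simply insert (3.116) into (3.114) with $A^0 = 0$ and compare the spin term with an empirical term of the form $-g\,(q/2m)\,\boldsymbol{B}\cdot\boldsymbol{S}$; no input beyond the equations already quoted is assumed.

```latex
\hat{H}_1 \Psi \approx \frac{1}{2m}\,(-i\boldsymbol{\nabla} - q\boldsymbol{A})^2\,\Psi
  \;-\; \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\boldsymbol{B}\,\Psi ,

-\frac{q}{2m}\,\boldsymbol{\sigma}\cdot\boldsymbol{B}
  = -\frac{q}{2m}\,(2\boldsymbol{S})\cdot\boldsymbol{B}
  = -2\,\frac{q}{2m}\,\boldsymbol{B}\cdot\boldsymbol{S}
  \qquad \left(\boldsymbol{S} = \tfrac{1}{2}\boldsymbol{\sigma}\right),

-g\,\frac{q}{2m}\,\boldsymbol{B}\cdot\boldsymbol{S}
  \;\;\Longleftrightarrow\;\;
  -2\,\frac{q}{2m}\,\boldsymbol{B}\cdot\boldsymbol{S}
  \quad\Rightarrow\quad g = 2 .
```

The factor of 2 arises entirely from the relation between the Pauli matrices and the spin operator: the Dirac equation produces $\boldsymbol{\sigma}\cdot\boldsymbol{B}$, not $\boldsymbol{S}\cdot\boldsymbol{B}$.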

In actual fact, the measured g-factor of the electron (and muon) is slightly greater than
this value: $g_{\rm exp} = 2(1 + a)$. The ‘anomaly’ a, which is of order $10^{-3}$ in size, is measured with
quite extraordinary precision (see section 11.7) for both the $e^-$ and $e^+$. This small correction
can also be computed with equally extraordinary accuracy, using the full theory of QED,
as we shall briefly explain in chapter 11. The agreement between theory and experiment is
phenomenal and is one example of such agreement exhibited by our ‘paradigm theory’.

It may be worth noting that spin-½ hadrons, such as the proton, have g-factors very
different from the Dirac prediction. This is because they are, as we know, composite objects
and are thus (in this respect) more like atoms or nuclei than ‘elementary particles’.


Problems 
3.1 


(a) In natural units $\hbar = c = 1$ and with 2m = 1, the Schrödinger equation may be
written as

$$-\boldsymbol{\nabla}^2\psi + V\psi - i\,\partial\psi/\partial t = 0.$$

Multiply this equation from the left by $\psi^*$ and multiply the complex conjugate
of this equation by $\psi$ (assume V is real). Subtract the two equations and show
that your answer may be written in the form of a continuity equation

$$\partial\rho/\partial t + \boldsymbol{\nabla}\cdot\boldsymbol{j} = 0$$

where $\rho = \psi^*\psi$ and $\boldsymbol{j} = i^{-1}[\psi^*(\boldsymbol{\nabla}\psi) - (\boldsymbol{\nabla}\psi^*)\psi]$.
(b) Perform the same operations for the Klein-Gordon equation and derive the
corresponding ‘probability’ density current. Show also that for a free-particle solution

$$\phi = N e^{-ip\cdot x}$$
with $p^\mu = (E, \boldsymbol{p})$, the probability current $j^\mu = (\rho, \boldsymbol{j})$ is proportional to $p^\mu$.
3.2 


(a) Prove the following properties of the matrices $\alpha_i$ and $\beta$:
(i) $\alpha_i$ and $\beta$ (i = 1, 2, 3) are all Hermitian [hint: what is the Hamiltonian?].

(ii) Tr $\alpha_i$ = Tr $\beta$ = 0 where ‘Tr’ means the trace, i.e. the sum of the diagonal
elements [hint: use Tr(AB) = Tr(BA) for any matrices A and B, and
prove this too!].

(iii) The eigenvalues of $\alpha_i$ and $\beta$ are ±1 [hint: square $\alpha_i$ and $\beta$].
(iv) The dimensionality of $\alpha_i$ and $\beta$ is even [hint: the trace of a matrix is equal
to the sum of its eigenvalues].

(b) Verify explicitly that the matrices $\boldsymbol{\alpha}$ and $\beta$ of (3.31), and of (3.40), satisfy the
Dirac conditions (3.34)–(3.36).


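3.3 For free-particle solutions of the Dirac equation
$$\psi = \omega\,e^{-ip\cdot x}$$

the four-component spinor $\omega$ may be written in terms of the two-component spinors
$$\omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}.$$

From the Dirac equation for $\psi$

$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + \beta m)\psi$$

using the explicit forms for the Dirac matrices

$$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

show that $\phi$ and $\chi$ satisfy the coupled equations

$$(E - m)\phi = \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi, \qquad (E + m)\chi = \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi$$

where $p^\mu = (E, \boldsymbol{p})$.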
3.4 


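(a) Using the explicit forms for the 2 x 2 Pauli matrices, verify the commutation 
(square brackets) and anti-commutation (braces) relations [note the summation 
convention for repeated indices: $\epsilon_{ijk}\sigma_k \equiv \sum_k \epsilon_{ijk}\sigma_k$]: 

$$[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k, \qquad \{\sigma_i, \sigma_j\} = 2\delta_{ij}\mathbf{1}$$

where $\epsilon_{ijk}$ is the usual antisymmetric tensor 

$$\epsilon_{ijk} = \begin{cases} +1 & \text{for an even permutation of 1, 2, 3} \\ -1 & \text{for an odd permutation of 1, 2, 3} \\ \phantom{+}0 & \text{if two or more indices are the same,} \end{cases}$$

$\delta_{ij}$ is the usual Kronecker delta, and $\mathbf{1}$ is the 2 x 2 unit matrix. Hence show that 

$$\sigma_i\sigma_j = \delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k.$$

(b) Use this last identity to prove the result 

$$(\boldsymbol{\sigma}\cdot\boldsymbol{a})(\boldsymbol{\sigma}\cdot\boldsymbol{b}) = \boldsymbol{a}\cdot\boldsymbol{b}\,\mathbf{1} + i\boldsymbol{\sigma}\cdot\boldsymbol{a}\times\boldsymbol{b}.$$

Using the explicit 2 x 2 form for 

$$\boldsymbol{\sigma}\cdot\boldsymbol{p} = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix}$$

show that 

$$(\boldsymbol{\sigma}\cdot\boldsymbol{p})^2 = \boldsymbol{p}^2\,\mathbf{1}.$$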

3.5 Verify the conservation equation (3.56). 


3.6 Check that $h(\boldsymbol{p})$ as given by (3.66) does commute with $\boldsymbol{\alpha}\cdot\boldsymbol{p} + \beta m$, the momentum-space 
free Dirac Hamiltonian. 


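3.7 Let $\phi$ be an arbitrary two-component spinor and $\hat{\boldsymbol{u}}$ be a unit vector. 

(i) Show that $\tfrac{1}{2}(1 + \boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}})\phi$ is an eigenstate of $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}}$ with eigenvalue $+1$. The operator 
$\tfrac{1}{2}(1 + \boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}})$ is called a projector operator for the $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}} = +1$ eigenstate since 
when acting on any $\phi$ this is what it ‘projects out’. Write down a similar operator 
which projects out the $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}} = -1$ eigenstate. 

(ii) Construct two two-component spinors $\phi_+$ and $\phi_-$ which are eigenstates of $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{u}}$ 
belonging to eigenvalues $\pm 1$, and normalized to $\phi_r^\dagger\phi_s = \delta_{rs}$ for $(r,s) = (+,-)$, 
for the case $\hat{\boldsymbol{u}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$ [hint: take the arbitrary $\phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$]. 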


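3.8 Positive-energy spinors $u(p, s)$ are defined by 

$$u(p, s) = (E + m)^{1/2}\begin{pmatrix} \phi^s \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E + m}\,\phi^s \end{pmatrix}, \qquad s = 1, 2$$

with $\phi^{s\dagger}\phi^s = 1$. Verify that these satisfy $u^\dagger u = 2E$. 
In a similar way, negative-energy spinors $v(p, s)$ are defined by 

$$v(p, s) = (E + m)^{1/2}\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E + m}\,\chi^s \\ \chi^s \end{pmatrix}, \qquad s = 1, 2$$

with $\chi^{s\dagger}\chi^s = 1$. Verify that $v^\dagger v = 2E$. 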


3.9 Using the KG equation together with the replacement $\partial^\mu \to \partial^\mu + iqA^\mu$, find the form 
of the potential $\hat{V}_{\rm KG}$ in the corresponding equation 

$$(\Box + m^2)\phi = -\hat{V}_{\rm KG}\,\phi$$

in terms of $A^\mu$. 


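3.10 Evaluate 

$$\{\boldsymbol{\sigma}\cdot(-i\boldsymbol{\nabla} - q\boldsymbol{A})\}^2\psi$$

by following the subsequent steps (or doing it your own way): 

(i) Multiply the operator by itself to get 

$$\{(\boldsymbol{\sigma}\cdot -i\boldsymbol{\nabla})^2 + iq(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla})(\boldsymbol{\sigma}\cdot\boldsymbol{A}) + iq(\boldsymbol{\sigma}\cdot\boldsymbol{A})(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla}) + q^2(\boldsymbol{\sigma}\cdot\boldsymbol{A})^2\}\psi.$$

The first and last terms are, respectively, $-\nabla^2$ and $q^2\boldsymbol{A}^2$ where the 2 x 2 unit 
matrix $\mathbf{1}$ is understood. The second and third terms are $iq(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla})(\boldsymbol{\sigma}\cdot\boldsymbol{A}\psi)$ and 
$iq(\boldsymbol{\sigma}\cdot\boldsymbol{A})(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla}\psi)$. These may be simplified using the identity of problem 3.4(b), 
but we must be careful to treat $\boldsymbol{\nabla}$ correctly as a differential operator. 

(ii) Show that $(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla})(\boldsymbol{\sigma}\cdot\boldsymbol{A})\psi = \boldsymbol{\nabla}\cdot(\boldsymbol{A}\psi) + i\boldsymbol{\sigma}\cdot\{\boldsymbol{\nabla}\times(\boldsymbol{A}\psi)\}$. Now use $\boldsymbol{\nabla}\times(\boldsymbol{A}\psi) = 
(\boldsymbol{\nabla}\times\boldsymbol{A})\psi - \boldsymbol{A}\times\boldsymbol{\nabla}\psi$ to simplify the last term. 

(iii) Similarly, show that $(\boldsymbol{\sigma}\cdot\boldsymbol{A})(\boldsymbol{\sigma}\cdot\boldsymbol{\nabla})\psi = \boldsymbol{A}\cdot\boldsymbol{\nabla}\psi + i\boldsymbol{\sigma}\cdot(\boldsymbol{A}\times\boldsymbol{\nabla}\psi)$. 

(iv) Hence verify (3.116). 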


4 


Lorentz Transformations and Discrete Symmetries 


In this chapter we shall review various covariances (see appendix D) of the KG and Dirac 
equations, concentrating mainly on the latter. First, we consider Lorentz transformations 
(rotations and velocity transformations) and show how the scalar KG wavefunction and the 
4-component Dirac spinor must transform so that the respective equations are covariant 
under these transformations. Then we perform a similar task for the discrete transforma- 
tions of parity, charge conjugation and time reversal. The results enable us to construct 
‘bilinear covariants’ having well-defined behaviour (scalar, pseudoscalar, vector, etc.) under 
these transformations. This is essential for later work, for two reasons: first, we shall be able 
to do dynamical calculations in a way that is manifestly covariant under Lorentz transfor- 
mations; and secondly we shall be ready to study physical problems in which the discrete 
transformations are, or are not, actual symmetries of the real world, a topic to which we 
shall return in the second volume. 


4.1 Lorentz transformations 
4.1.1 The KG equation 


In order to ensure that the laws of physics are the same in all inertial frames, we require our 
relativistic wave equations to be covariant under Lorentz transformations—that is, they 
must have the same form in the two different frames (see appendix D). In the case of the 
KG equation 


$$(\Box + m^2)\phi(x) = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi(x) + q^2A^2(x)\phi(x) \tag{4.1}$$

for a particle of charge $q$ in the field $A^\mu$, this requirement is taken care of, almost 
automatically, by the notation. Consider a Lorentz transformation such that $x \to x'$. $A^\mu$ 
will transform by the usual 4-vector transformation law (i.e. like $x^\mu$), which we write as 
$A^\mu(x) \to A'^\mu(x')$. Similarly we write the transform of $\phi$ as $\phi(x) \to \phi'(x')$. Then in the 
primed coordinate frame physics must be described by the equation 

$$(\Box' + m^2)\phi'(x') = -iq[\partial'_\mu A'^\mu(x') + A'^\mu(x')\partial'_\mu]\phi'(x') + q^2A'^2(x')\phi'(x'). \tag{4.2}$$

Now the 4-dimensional dot products appearing in (4.2) are all invariant under the Lorentz 
transformation, so that (4.2) can be written as 

$$(\Box + m^2)\phi'(x') = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi'(x') + q^2A^2(x)\phi'(x'), \tag{4.3}$$

and we see that the wavefunction in the primed frame may be identified (up to a phase) 
with that in the unprimed frame: 

$$\phi'(x') = \phi(x). \tag{4.4}$$

Equation (4.4) is the condition for the KG equation to be covariant under Lorentz 
transformations. Since $x'$ is a known function of $x$, given by the angles and velocities parametrizing 
the transformation, equation (4.4) enables one to construct the correct function $\phi'$ which 
the primed observers must use, in order to be consistent with the unprimed observers. 

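By way of illustration, consider a rotation of the coordinate system by an angle $\alpha$ in 
a positive sense about the $x$-axis; then the position vector referred to the new system is 
$\boldsymbol{x}' = (x', y', z')$ where 

$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{4.5}$$

which we shall write as 

$$\boldsymbol{x}' = \mathbf{R}_x(\alpha)\,\boldsymbol{x}. \tag{4.6}$$

Correspondingly, equation (4.4) is, in this case, 

$$\phi'(\mathbf{R}_x(\alpha)\,\boldsymbol{x}) = \phi(\boldsymbol{x}), \tag{4.7}$$

which can also be written as 

$$\phi'(\boldsymbol{x}) = \phi(\mathbf{R}_x^{-1}(\alpha)\,\boldsymbol{x}). \tag{4.8}$$

It is convenient to begin with an ‘infinitesimal rotation’, where the angle $\alpha$ in (4.5) is 
replaced by $\epsilon_x$ such that $\cos\epsilon_x \approx 1$ and $\sin\epsilon_x \approx \epsilon_x$. Then it is easy to verify that (4.5) 
becomes 

$$\boldsymbol{x}' = \mathbf{R}_x(\epsilon_x)\,\boldsymbol{x} = \boldsymbol{x} - \boldsymbol{\epsilon}\times\boldsymbol{x} \tag{4.9}$$

where $\boldsymbol{\epsilon} = (\epsilon_x, 0, 0)$. For a general infinitesimal rotation, we simply replace this $\boldsymbol{\epsilon}$ by a 
general one, $(\epsilon_x, \epsilon_y, \epsilon_z)$. For such a rotation, condition (4.8) becomes 

$$\phi'(\boldsymbol{x}) = \phi(\boldsymbol{x} + \boldsymbol{\epsilon}\times\boldsymbol{x}). \tag{4.10}$$

Expanding the right hand side to first order in $\boldsymbol{\epsilon}$ we obtain 

$$\phi'(\boldsymbol{x}) = \phi(\boldsymbol{x}) + (\boldsymbol{\epsilon}\times\boldsymbol{x})\cdot\boldsymbol{\nabla}\phi = \phi(\boldsymbol{x}) + \boldsymbol{\epsilon}\cdot(\boldsymbol{x}\times\boldsymbol{\nabla})\phi = (1 + i\boldsymbol{\epsilon}\cdot\hat{\boldsymbol{L}})\phi(\boldsymbol{x}) \tag{4.11}$$

where $\hat{\boldsymbol{L}}$ is the vector angular momentum operator $\boldsymbol{x}\times -i\boldsymbol{\nabla}$. 
The rule for finite rotations may be obtained from the infinitesimal form by using the 
result 

$$e^A = \lim_{n\to\infty}(1 + A/n)^n \tag{4.12}$$

generalized to differential operators (the exponential of a matrix being understood as the 
infinite series $\exp A = 1 + A + \tfrac{1}{2}A^2 + \ldots$). Let $\boldsymbol{\epsilon} = \boldsymbol{\alpha}/n$, where $\boldsymbol{\alpha} = (\alpha_x, \alpha_y, \alpha_z)$ are 
three real finite parameters; we may think of the direction of $\boldsymbol{\alpha}$ as representing the axis 
of the rotation, and the magnitude of $\boldsymbol{\alpha}$ as representing the angle of rotation. Then applying 
the transformation (4.11) $n$ times, and letting $n$ tend to infinity, we obtain for the finite 
rotation 

$$\phi'(\boldsymbol{x}) = e^{i\boldsymbol{\alpha}\cdot\hat{\boldsymbol{L}}}\phi(\boldsymbol{x}) \equiv \hat{U}_R(\boldsymbol{\alpha})\,\phi(\boldsymbol{x}). \tag{4.13}$$

Note that $\hat{U}_R(\boldsymbol{\alpha})$ is a unitary operator, since $\hat{U}_R^\dagger$ is the inverse rotation. 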

Equation (4.13) is, of course, the familiar rule for rotations of scalar wavefunctions, 
exhibiting the intimate connection between rotations and angular momentum in quantum 
mechanics. We recall that if a Hamiltonian is invariant under rotations, then the operators 
$\hat{\boldsymbol{L}}$ commute with the Hamiltonian and angular momentum is conserved. 

A similar calculation may be done for velocity transformations (‘boosts’), leading to 
corresponding operators $\hat{\boldsymbol{K}}$ (see problem 4.1). 
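
As a numerical aside (not part of the original text), the limit (4.12) can be illustrated for the rotation (4.5): composing the infinitesimal rotation (4.9) with angle $\alpha/n$ a total of $n$ times should approach the finite rotation matrix as $n$ grows. The following is a minimal sketch using only the Python standard library; the function names and the test angle 0.8 are our own illustrative choices.

```python
import math


def rotation_x(alpha):
    """Finite rotation matrix R_x(alpha) of equation (4.5)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def infinitesimal_x(eps):
    """First-order form of R_x(eps), i.e. x' = x - eps x x with eps = (eps, 0, 0)."""
    return [[1.0, 0.0, 0.0], [0.0, 1.0, eps], [0.0, -eps, 1.0]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compose(alpha, n):
    """Apply the infinitesimal rotation by alpha/n a total of n times."""
    step = infinitesimal_x(alpha / n)
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(n):
        result = matmul(step, result)
    return result


def max_error(alpha, n):
    """Largest matrix-element difference between (1 + A/n)^n and R_x(alpha)."""
    exact = rotation_x(alpha)
    approx = compose(alpha, n)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(3) for j in range(3))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_error(0.8, n))
```

The printed error should decrease roughly in proportion to $1/n$, which is the expected behaviour of the product formula (4.12).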
4.1.2 The Dirac equation 


The case of the Dirac equation is more complicated, because (unlike the KG $\phi$) the 
wavefunction has more than one component, corresponding to the fact that it describes a spin-1/2 
particle. There is, however, a direct connection between the angular momentum associated 
with a wavefunction, and the way that the wavefunction transforms under rotations of the 
coordinate system. To take a simple case, the 2p wavefunctions mentioned in section 3.2 
correspond to l = 1 on the one hand and, on the other, to the components of a vector— 
indeed the most basic vector of all, the position vector x = (x,y,z) itself. If we rotate the 
coordinate system in the way represented by (4.5), the components in the primed system 
transform into simple linear combinations of the components in the original system. 

Very much the same thing happens in the case of spinor wavefunctions, except that 
they transform in a way different from—though closely related to—that of vectors. In the 
present section we shall discuss how this works for three-dimensional rotations of the spatial 
coordinate system, and explain how it generalizes to boosts (i.e. Lorentz velocity transfor- 
mations), which include transformations of the time coordinate as well. It will be convenient 
to use the alternative representation (3.40) for the Dirac matrices. In this representation, 
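the components $\phi$, $\chi$ of the free-particle 4-spinor $\omega$ of (3.43) satisfy 

$$E\phi = \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi + m\chi \tag{4.14}$$
$$E\chi = -\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi + m\phi \tag{4.15}$$

rather than (3.45) and (3.46). 
As before, we start with the infinitesimal rotation (4.9). Since $\boldsymbol{p}$ is a vector, it transforms 
in the same way as $\boldsymbol{x}$, so that under an infinitesimal rotation $\boldsymbol{p}$ becomes $\boldsymbol{p}'$ where 

$$\boldsymbol{p}' = \boldsymbol{p} - \boldsymbol{\epsilon}\times\boldsymbol{p}. \tag{4.16}$$

The question for us now is: how do the spinors $\phi$ and $\chi$ transform under this same rotation 
of the coordinate system? 

The essential point is that in the new coordinate system the defining equations (4.14) 
and (4.15) should take exactly the same form, namely 

$$E\phi' = \boldsymbol{\sigma}\cdot\boldsymbol{p}'\,\phi' + m\chi' \tag{4.17}$$
$$E\chi' = -\boldsymbol{\sigma}\cdot\boldsymbol{p}'\,\chi' + m\phi' \tag{4.18}$$

where $\phi'$ and $\chi'$ are the spinors in the new coordinate system, and we have used the fact 
that both $E$ and $m$ do not change under rotations. Our task is to find $\phi'$ and $\chi'$ in terms 
of $\phi$ and $\chi$. 
Since both $\phi$ and $\chi$ are 2-component spinors, we might guess from (4.11) that the answer 
is 

$$\phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi, \qquad \chi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\chi, \tag{4.19}$$

since the $\boldsymbol{\sigma}/2$ are the spin-1/2 matrices, taking the place of $\hat{\boldsymbol{L}}$. To check that this is, in fact, 
the correct transformation law, we proceed as follows. First, multiply (4.14) from the left 
by the matrix $(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)$; then, since $E$ and $m$ commute with all matrices, the result is 

$$E\phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi + m\chi' \tag{4.20}$$
$$\phantom{E\phi'} = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\boldsymbol{p}\,(1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi' + m\chi' \tag{4.21}$$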


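where we have used 

$$(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)^{-1} \approx (1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2) \tag{4.22}$$

to first order in $\boldsymbol{\epsilon}$. Keeping only first order terms in $\boldsymbol{\epsilon}$, the first term on the right-hand side 
of (4.21) is 

$$\left(\boldsymbol{\sigma}\cdot\boldsymbol{p} + \tfrac{i}{2}\,\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\,\boldsymbol{\sigma}\cdot\boldsymbol{p} - \tfrac{i}{2}\,\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\right)\phi'. \tag{4.23}$$

This can be simplified using the result from problem 3.4(b): 

$$\boldsymbol{\sigma}\cdot\boldsymbol{a}\,\boldsymbol{\sigma}\cdot\boldsymbol{b} = \boldsymbol{a}\cdot\boldsymbol{b} + i\boldsymbol{\sigma}\cdot\boldsymbol{a}\times\boldsymbol{b}, \tag{4.24}$$

provided all the components of $\boldsymbol{a}$ and $\boldsymbol{b}$ commute. Applying (4.24), (4.23) becomes 

$$\left[\boldsymbol{\sigma}\cdot\boldsymbol{p} + \tfrac{i}{2}(\boldsymbol{\epsilon}\cdot\boldsymbol{p} + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\boldsymbol{p}) - \tfrac{i}{2}(\boldsymbol{\epsilon}\cdot\boldsymbol{p} + i\boldsymbol{\sigma}\cdot\boldsymbol{p}\times\boldsymbol{\epsilon})\right]\phi' \tag{4.25}$$
$$= (\boldsymbol{\sigma}\cdot\boldsymbol{p} - \boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\boldsymbol{p})\phi' = \boldsymbol{\sigma}\cdot\boldsymbol{p}'\,\phi'. \tag{4.26}$$

Hence (4.21) is just 

$$E\phi' = \boldsymbol{\sigma}\cdot\boldsymbol{p}'\,\phi' + m\chi' \tag{4.27}$$

as required in (4.17). We can similarly check the correctness of the transformation law (4.19) 
for $\chi$. 

1 We shall derive (4.19), and the corresponding rule for velocity transformations, equation (4.42), in 
appendix M of volume 2 using group theory. 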

The transformation rule for a finite rotation may be obtained from the infinitesimal 
form by using the result (4.12) applied to matrices. Then, for a finite rotation we obtain 
the result 

$$\phi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\phi, \qquad \chi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\chi. \tag{4.28}$$

We note that the behaviour of $\phi$ and $\chi$ under rotations is the same; equation (4.28) is the 
way all 2-component spinors transform under rotations. 

By way of an illustration, consider the case of the finite rotation (4.5). Here $\boldsymbol{\alpha} = (\alpha, 0, 0)$, 
and the transformation matrix is 

$$\exp(i\sigma_x\alpha/2) = 1 + i\sigma_x\alpha/2 + \tfrac{1}{2}(i\sigma_x\alpha/2)^2 + \ldots \tag{4.29}$$

Multiplying out the terms in (4.29) and remembering that $\sigma_x^2 = 1$, we see that the 
transformation matrix is 

$$\cos\alpha/2 + i\sigma_x\sin\alpha/2 = \begin{pmatrix} \cos\alpha/2 & i\sin\alpha/2 \\ i\sin\alpha/2 & \cos\alpha/2 \end{pmatrix}. \tag{4.30}$$

This means that the components $\phi_1$, $\phi_2$ of the spinor $\phi$ transform according to the rule 

$$\phi_1' = \cos\alpha/2\;\phi_1 + i\sin\alpha/2\;\phi_2 \tag{4.31}$$
$$\phi_2' = i\sin\alpha/2\;\phi_1 + \cos\alpha/2\;\phi_2, \tag{4.32}$$

for this particular rotation. The transformed components are linear combinations of the 
original components, but it is the half-angle $\alpha/2$ that enters, not $\alpha$. 
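
The step from the series (4.29) to the closed form (4.30) can also be checked numerically. The sketch below is an illustration added here, not part of the original derivation; it sums the exponential series for $i\sigma_x\alpha/2$ term by term and compares it with (4.30). The helper names are our own, and the truncation at 60 terms is an arbitrary choice that is ample for angles up to a few multiples of $\pi$.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(m, terms=60):
    """exp(m) from the power series 1 + m + m^2/2! + ... (truncated)."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for k in range(1, terms):
        # term now holds m^k / k!
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result


def spinor_rotation_x(alpha):
    """Series evaluation of exp(i sigma_x alpha / 2), equation (4.29)."""
    generator = [[0.5j * alpha * x for x in row] for row in SIGMA_X]
    return exp_series(generator)


def closed_form_x(alpha):
    """Closed form cos(alpha/2) + i sigma_x sin(alpha/2), equation (4.30)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c, 1j * s], [1j * s, c]]


def max_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(max_diff(spinor_rotation_x(1.3), closed_form_x(1.3)))
    # A rotation by 2*pi returns a vector to itself but multiplies a spinor by -1.
    print(spinor_rotation_x(2 * math.pi))
```

The second print makes the half-angle behaviour concrete: for $\alpha = 2\pi$ the matrix (4.30) is minus the unit matrix, so only a rotation by $4\pi$ restores the spinor.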
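Let us denote the finite transformation matrix by $\mathbf{U}$, so that 

$$\mathbf{U} = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2) \quad\text{and}\quad \mathbf{U}^\dagger = \exp(-i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2). \tag{4.33}$$

It follows that 

$$\mathbf{U}\mathbf{U}^\dagger = \mathbf{U}^\dagger\mathbf{U} = 1, \tag{4.34}$$

since the rotation parametrized by $-\boldsymbol{\alpha}$ clearly undoes the rotation parametrized by $\boldsymbol{\alpha}$. So 
$\mathbf{U}$ is a 2 x 2 unitary matrix. It follows that the normalization of $\phi$ and $\chi$ is preserved 
under rotations: $\phi'^\dagger\phi' = \phi^\dagger\phi$, and $\chi'^\dagger\chi' = \chi^\dagger\chi$. The free-particle Dirac probability density 
$\rho = \psi^\dagger\psi = \phi^\dagger\phi + \chi^\dagger\chi$ is therefore also (as we expect) invariant under rotations. 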

More interestingly, we can examine the way the free-particle current density 


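$$\boldsymbol{j} = \psi^\dagger\boldsymbol{\alpha}\psi = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi \tag{4.35}$$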


transforms under rotations. Of course, it should behave as a 3-vector, and this is checked 
in problem 4.2(a). 

We now turn to the behaviour of the spinors $\phi$ and $\chi$ under boosts, which mix $\boldsymbol{x}$ and $t$, 
or equivalently $\boldsymbol{p}$ and $E$. For example, consider a Lorentz velocity transformation (boost) 
from a frame S to a frame S′ which is moving with speed $u$ with respect to S along the 
common $x$-axis. Then the energy $E$ and momentum $p_x$ of a particle in S are transformed 
to $E'$ and $p_x'$ in S′ where (cf (D.1)) 

$$E' = \cosh\vartheta\,E - \sinh\vartheta\,p_x \tag{4.36}$$
$$p_x' = \cosh\vartheta\,p_x - \sinh\vartheta\,E, \tag{4.37}$$

where $\cosh\vartheta = (1 - u^2)^{-1/2} \equiv \gamma(u)$, and $\sinh\vartheta = \gamma(u)u$. As before, we start with an 
infinitesimal transformation, where $\vartheta$ is replaced by $\eta_x$ such that $\cosh\eta_x \approx 1$ and $\sinh\eta_x \approx \eta_x$. 
Then (4.36) and (4.37) become $E' = E - \eta_x p_x$, $p_x' = p_x - \eta_x E$. For the general infinitesimal 
boost parametrized by $\boldsymbol{\eta} = (\eta_x, \eta_y, \eta_z)$, the transformation law for $(E, \boldsymbol{p})$ is 

$$E' = E - \boldsymbol{\eta}\cdot\boldsymbol{p} \tag{4.38}$$
$$\boldsymbol{p}' = \boldsymbol{p} - \boldsymbol{\eta}E. \tag{4.39}$$

Once again, we have to determine $\phi'$ and $\chi'$ such that the transformed versions of (4.14) 
and (4.15) are 

$$(E' - \boldsymbol{\sigma}\cdot\boldsymbol{p}')\phi' = m\chi' \tag{4.40}$$
$$(E' + \boldsymbol{\sigma}\cdot\boldsymbol{p}')\chi' = m\phi'. \tag{4.41}$$

Note that this time $E$ does transform, according to (4.38). 
The required $\phi'$ and $\chi'$ are 

$$\phi' = (1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi, \qquad \chi' = (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi. \tag{4.42}$$

The spinors $\phi$ and $\chi$ behaved the same under rotations, but they transform differently under 
boosts. There are two kinds of 2-component spinors, $\phi$-type and $\chi$-type, in the representation 
(3.40), which are distinguished by their behaviour under boosts. The group theory behind 
this will be explained in appendix M of volume 2. For the moment, we simply note that 
these 2-component spinors, each with well-defined Lorentz transformation properties, are 
called Weyl spinors (Weyl 1929). 

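To verify the rule (4.42), take equation (4.14) in the form (4.40) and multiply from the 
left by the matrix $(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$, to obtain 

$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\boldsymbol{p})\phi = m\chi', \tag{4.43}$$

or equivalently 

$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\boldsymbol{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi' = m\chi', \tag{4.44}$$

where we have used $(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)^{-1} \approx (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$. For (4.44) to be consistent with (4.40) 
we require 

$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\boldsymbol{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2) = E' - \boldsymbol{\sigma}\cdot\boldsymbol{p}'. \tag{4.45}$$

Keeping only first-order terms in $\boldsymbol{\eta}$, the left-hand side of (4.45) is 

$$E - \boldsymbol{\sigma}\cdot\boldsymbol{p} + E\,\boldsymbol{\sigma}\cdot\boldsymbol{\eta} - \tfrac{1}{2}(\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\boldsymbol{\sigma}\cdot\boldsymbol{\eta} + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}\,\boldsymbol{\sigma}\cdot\boldsymbol{p}) \tag{4.46}$$
$$= E - \boldsymbol{\eta}\cdot\boldsymbol{p} - \boldsymbol{\sigma}\cdot(\boldsymbol{p} - \boldsymbol{\eta}E) \tag{4.47}$$
$$= E' - \boldsymbol{\sigma}\cdot\boldsymbol{p}' \tag{4.48}$$

as required for the right-hand side of (4.45). 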
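
The first-order statement (4.45) lends itself to a quick numerical check: if the identity holds to first order in $\boldsymbol{\eta}$, the difference between its two sides must shrink quadratically as $\boldsymbol{\eta}$ is scaled down. The sketch below is an added illustration with hypothetical helper names and an arbitrarily chosen momentum, mass ($m = 1$) and boost direction.

```python
import math


def pauli_dot(v):
    """Return the 2x2 matrix sigma.v for a real 3-vector v."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def boost_residual(energy, p, eta):
    """Largest matrix element of (LHS - RHS) of equation (4.45)."""
    one = [[1, 0], [0, 1]]
    s_eta, s_p = pauli_dot(eta), pauli_dot(p)
    b = [[one[i][j] + 0.5 * s_eta[i][j] for j in range(2)] for i in range(2)]
    m = [[energy * one[i][j] - s_p[i][j] for j in range(2)] for i in range(2)]
    lhs = matmul(matmul(b, m), b)
    energy_new = energy - sum(e * q for e, q in zip(eta, p))   # (4.38)
    p_new = [q - e * energy for q, e in zip(p, eta)]           # (4.39)
    s_p_new = pauli_dot(p_new)
    rhs = [[energy_new * one[i][j] - s_p_new[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


def example_residual(scale):
    """Residual for an arbitrary test momentum and a boost of size ~scale."""
    p = (0.3, -0.4, 1.2)
    energy = math.sqrt(1.0 + sum(q * q for q in p))   # mass m = 1
    eta = [scale * c for c in (0.5, -0.2, 0.7)]
    return boost_residual(energy, p, eta)


if __name__ == "__main__":
    for scale in (1e-1, 1e-2, 1e-3):
        print(scale, example_residual(scale))
```

Reducing the boost parameter by a factor of 10 should reduce the residual by a factor of about 100, confirming that only second-order terms in $\boldsymbol{\eta}$ have been dropped.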
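For a finite boost $\phi$ and $\chi$ transform by the ‘exponentiation’ of (4.42), namely 

$$\phi' = \exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\phi, \qquad \chi' = \exp(\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\chi \tag{4.49}$$

where the three real parameters $\boldsymbol{\vartheta} = (\vartheta_x, \vartheta_y, \vartheta_z)$ specify the direction and magnitude of the 
boost. In contrast to (4.28), the transformations (4.49) are not unitary. If we denote the 
matrix $\exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)$ by $\mathbf{B}$, we have $\mathbf{B} = \mathbf{B}^\dagger$ rather than $\mathbf{B}^{-1} = \mathbf{B}^\dagger$. So $\mathbf{B}$ does not leave 
$\phi^\dagger\phi$ and $\chi^\dagger\chi$ invariant. Actually this is no surprise. We already know from section 4.1.2 
that the density $\phi^\dagger\phi + \chi^\dagger\chi$ ought to transform as the fourth component $\rho$ of the 4-vector 
$j^\mu = (\rho, \boldsymbol{j})$. Let us check this for our infinitesimal boost: 

$$\rho' = \phi'^\dagger\phi' + \chi'^\dagger\chi'$$
$$= \phi^\dagger(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi + \chi^\dagger(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi$$
$$= \phi^\dagger\phi + \chi^\dagger\chi - \phi^\dagger\boldsymbol{\sigma}\phi\cdot\boldsymbol{\eta} + \chi^\dagger\boldsymbol{\sigma}\chi\cdot\boldsymbol{\eta}$$
$$= \rho - \boldsymbol{\eta}\cdot\boldsymbol{j} \tag{4.50}$$

as required by (4.38). Similarly, it may be verified (problem 4.2(b)) that $\boldsymbol{j}$ transforms as 
the 3-vector part of the 4-vector $j^\mu$, under this infinitesimal boost. 

On the other hand, the products $\phi^\dagger\chi$ and $\chi^\dagger\phi$ are clearly invariant under the 
transformation (4.49), since the exponential factors cancel. This means that the quantity $\psi^\dagger\beta\psi$ is 
a Lorentz invariant (note the form of $\beta$ in (3.40)). 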
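
The contrast between the non-invariant density $\phi^\dagger\phi$ and the invariant product $\phi^\dagger\chi$ under the finite boost (4.49) can be made concrete numerically. The sketch below is an added illustration; it uses the closed form $\exp(\pm\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2) = \cosh(\vartheta/2) \pm \boldsymbol{\sigma}\cdot\hat{\boldsymbol{n}}\sinh(\vartheta/2)$, which follows from $(\boldsymbol{\sigma}\cdot\hat{\boldsymbol{n}})^2 = 1$, and the spinor components and boost parameters are arbitrary test values.

```python
import math


def pauli_dot(v):
    """Return the 2x2 matrix sigma.v for a real 3-vector v."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def boost_matrix(theta, sign):
    """exp(sign * sigma.theta / 2) = cosh(t/2) + sign * (sigma.n) sinh(t/2)."""
    size = math.sqrt(sum(c * c for c in theta))
    if size == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    s_n = pauli_dot([c / size for c in theta])
    ch, sh = math.cosh(size / 2), math.sinh(size / 2)
    return [[ch * (i == j) + sign * sh * s_n[i][j] for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Matrix times two-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Arbitrary test spinors and boost parameters (illustrative values only).
PHI = [0.6 + 0.2j, -0.3 + 0.5j]
CHI = [0.1 - 0.7j, 0.4 + 0.3j]
THETA = (0.4, -0.9, 0.7)

PHI_BOOSTED = apply(boost_matrix(THETA, -1), PHI)   # phi' of (4.49)
CHI_BOOSTED = apply(boost_matrix(THETA, +1), CHI)   # chi' of (4.49)

if __name__ == "__main__":
    print("phi^dag chi :", inner(PHI, CHI), "->", inner(PHI_BOOSTED, CHI_BOOSTED))
    print("phi^dag phi :", inner(PHI, PHI), "->", inner(PHI_BOOSTED, PHI_BOOSTED))
```

The first line of output should show the same value before and after the boost, while the second should not, in line with the Hermitian (rather than unitary) character of $\mathbf{B}$.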

At this point it is beginning to be clear that a more ‘covariant-looking’ notation would 
be very desirable. In the case of the KG probability current, the 4-vector index $\mu$ was 
clearly visible in the expression on the right-hand side of (3.20), but there is nothing similar 
in the Dirac case so far. In problem 4.3 the four ‘$\gamma$ matrices’ are introduced, defined by 
$\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$ with $\gamma^0 = \beta$ and $\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, together with the quantity $\bar\psi \equiv \psi^\dagger\gamma^0$, in terms of 
which the Dirac $\rho$ of (3.51) and $\boldsymbol{j}$ of (3.57) can be written as $\bar\psi(x)\gamma^0\psi(x)$ and $\bar\psi(x)\boldsymbol{\gamma}\psi(x)$, 
respectively. The complete Dirac 4-current is then 

$$j^\mu = \bar\psi(x)\gamma^\mu\psi(x). \tag{4.51}$$

For free particle solutions, we (and problem 4.2) have established that $j^\mu$ of (4.51) indeed 
transforms as a 4-vector under infinitesimal rotations and boosts. We have also just seen 
that the quantity $\bar\psi\psi$ is an invariant. 

We end this section by illustrating the use of the finite boost transformations (4.49). 
Consider two frames S and S′, such that in S a particle is at rest with $E = m$, $\boldsymbol{p} = \boldsymbol{0}$, and 
with spin up along the $z$-axis; in S′, the particle has energy $E'$, momentum $\boldsymbol{p}' = (0, 0, p')$, 
and spin up along the $z$-axis. If we apply a boost such that S′ has velocity $(0, 0, -v')$ relative 
to S, where $v' = p'/E'$, then $E$ and $\boldsymbol{p}$ become 

$$E' = \cosh\vartheta'\,E = m\gamma(v') \tag{4.52}$$
$$p' = \sinh\vartheta'\,E = mv'\gamma(v') \tag{4.53}$$
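as required. Now consider the forms of the 4-spinors in S and S′. In S, from (4.14) and 
(4.15) we have simply $\phi = \chi$, and if we normalize such that $\bar{u}u = 2m$ we may take 

$$u_S = \sqrt{m}\begin{pmatrix} \phi_+ \\ \phi_+ \end{pmatrix}, \qquad \phi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{4.54}$$

In S′ the spinor is 

$$u_{S'} = N\begin{pmatrix} \phi_+ \\ \dfrac{E' - p'}{m}\,\phi_+ \end{pmatrix} \tag{4.55}$$

where the normalization $N$ is determined (since $\bar{u}u$ is invariant) from the condition $\bar{u}_{S'}u_{S'} = 
2m$ to be $N = (E' + p')^{1/2}$, giving 

$$u_{S'} = \begin{pmatrix} (E' + p')^{1/2}\,\phi_+ \\ (E' - p')^{1/2}\,\phi_+ \end{pmatrix}. \tag{4.56}$$

But we can also calculate $u_{S'}$ by applying the transformation (4.49) with $\tanh\vartheta_z = -v'$ 
(that is, $\vartheta_z = -\vartheta'$) to $u_S$. Then the upper two components become 

$$\phi' = \sqrt{m}\,e^{\sigma_z\vartheta'/2}\phi_+ = \sqrt{m}\,e^{\vartheta'/2}\phi_+, \tag{4.57}$$

while the lower two components become 

$$\chi' = \sqrt{m}\,e^{-\vartheta'/2}\phi_+. \tag{4.58}$$

Now we can write 

$$e^{\vartheta'/2} = (e^{\vartheta'})^{1/2} = (\cosh\vartheta' + \sinh\vartheta')^{1/2} = \left(\frac{E' + p'}{m}\right)^{1/2} \tag{4.59}$$

and 

$$e^{-\vartheta'/2} = \left(\frac{E' - p'}{m}\right)^{1/2}, \tag{4.60}$$

and so we recover (4.56). 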
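
The agreement between the boosted rest-frame spinor (4.57)-(4.58) and the directly constructed spinor (4.56) can be confirmed with a few lines of arithmetic. The sketch below is an added illustration, assuming only $\tanh\vartheta' = v' = p'/E'$ as in (4.52) and (4.53); the function names and sample values are hypothetical.

```python
import math


def boosted_components(m, p):
    """Upper and lower amplitudes from the finite boost, (4.57) and (4.58)."""
    energy = math.sqrt(m * m + p * p)
    rapidity = math.atanh(p / energy)      # tanh(theta') = v' = p'/E'
    upper = math.sqrt(m) * math.exp(rapidity / 2)
    lower = math.sqrt(m) * math.exp(-rapidity / 2)
    return upper, lower


def direct_components(m, p):
    """Upper and lower amplitudes read off from equation (4.56)."""
    energy = math.sqrt(m * m + p * p)
    return math.sqrt(energy + p), math.sqrt(energy - p)


if __name__ == "__main__":
    for mass, momentum in ((1.0, 0.0), (1.0, 2.5), (0.511, 10.0)):
        print(boosted_components(mass, momentum), direct_components(mass, momentum))
```

Since $\bar{u}u = \phi^\dagger\chi + \chi^\dagger\phi$ in the representation (3.40), twice the product of the two amplitudes should equal $2m$ in every frame, which gives a second check on the normalization.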


4.2 Discrete transformations: P, C, and T 


The transformations we considered in section 4.1 are known as ‘continuous’, because the 
parameters involved (angles, speeds) vary continuously. This is essentially the reason we 
were able to build up finite transformations from infinitesimal ones, which differ only slightly 
from the identity transformation; finite transformations could be reached continuously from 
the identity. But there is another class of transformations, called ‘discrete’, which cannot be 
reached continuously from the identity. Examples of discrete transformations are parity (or 
space inversion), charge conjugation, and time reversal, and their combinations. Although 
these discrete transformations are important primarily in weak interactions, which we shall 
not cover until the second volume, it is useful to discuss the behaviour of Dirac wavefunctions 
under discrete transformations at this stage. Among other things, more light will be cast 
on antiparticles. 
4.2.1 Parity 


The parity (or space inversion) transformation $\mathbf{P}$ is defined by 

$$\mathbf{P}:\ \boldsymbol{x} \to \boldsymbol{x}' = -\boldsymbol{x}, \quad t \to t; \tag{4.61}$$

that is, $\mathbf{P}$ inverts the spatial coordinates. It follows that $\mathbf{P}$ also inverts momenta ($\boldsymbol{p} \to -\boldsymbol{p}$) 
but does not change angular momenta ($\boldsymbol{x}\times\boldsymbol{p} \to \boldsymbol{x}\times\boldsymbol{p}$) or spin ($\boldsymbol{\sigma} \to \boldsymbol{\sigma}$). We already see 
that there are two kinds of 3-vectors: polar 3-vectors which change sign under $\mathbf{P}$ and axial 
vectors which do not. For example, the electric field $\boldsymbol{E}$ and the vector potential $\boldsymbol{A}$ are polar 
vectors, while the magnetic field $\boldsymbol{B}$ is an axial vector. There are also scalar quantities (such 
as $\boldsymbol{x}\cdot\boldsymbol{p}$) which do not change sign under $\mathbf{P}$, and pseudoscalar quantities (such as $\boldsymbol{\sigma}\cdot\boldsymbol{p}$) 
which do. 

Consider first the KG equation (4.1). Since $\boldsymbol{A}$ is a polar vector, it changes sign under 
parity, as does $\boldsymbol{\nabla}$, while both $\partial/\partial t$ and $A^0$ remain the same. The scalar products $\partial_\mu A^\mu$ 
and $A^\mu\partial_\mu$ are therefore invariant under parity, as are $\Box$ and $A^2$. Hence we may identify 
$\phi_P(\boldsymbol{x}', t) = \phi(\boldsymbol{x}, t)$, or equivalently 

$$\phi_P(\boldsymbol{x}, t) = \phi(-\boldsymbol{x}, t) \equiv \hat{P}_0\,\phi(\boldsymbol{x}, t), \tag{4.62}$$

where $\hat{P}_0$ is the coordinate inversion operator. Note that we are calling the transformed 
wavefunction $\phi_P$ rather than yet another $\phi'$ since we need to keep track of what 
transformation we are considering. If we take $\phi(x)$ to be a positive-energy free particle solution with 
energy $E$ and momentum $\boldsymbol{p}$, $\phi_P$ will describe a positive energy particle with momentum 
$-\boldsymbol{p}$, as we expect. 
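Now let us study the covariance of the free particle Dirac equation 

$$i\frac{\partial\psi}{\partial t}(\boldsymbol{x}, t) = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi(\boldsymbol{x}, t) + \beta m\psi(\boldsymbol{x}, t) \tag{4.63}$$

under $\mathbf{P}$. Equation (4.63) will be covariant under (4.61) if we can find a wavefunction 
$\psi_P(\boldsymbol{x}', t)$ for observers using the transformed coordinate system such that their Dirac 
equation has exactly the same form in their system as (4.63): 

$$i\frac{\partial\psi_P}{\partial t}(\boldsymbol{x}', t) = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}'\psi_P(\boldsymbol{x}', t) + \beta m\psi_P(\boldsymbol{x}', t). \tag{4.64}$$

Now we know that $\boldsymbol{\nabla}' = -\boldsymbol{\nabla}$, since $\boldsymbol{x}' = -\boldsymbol{x}$. Hence (4.64) becomes 

$$i\frac{\partial\psi_P}{\partial t}(\boldsymbol{x}', t) = i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi_P(\boldsymbol{x}', t) + \beta m\psi_P(\boldsymbol{x}', t). \tag{4.65}$$

Multiplying this equation from the left by $\beta$ and using $\beta\boldsymbol{\alpha} = -\boldsymbol{\alpha}\beta$, we find 

$$i\frac{\partial}{\partial t}[\beta\psi_P(\boldsymbol{x}', t)] = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}[\beta\psi_P(\boldsymbol{x}', t)] + \beta m[\beta\psi_P(\boldsymbol{x}', t)]. \tag{4.66}$$

Comparing (4.66) and (4.63), it follows that we may consistently translate between $\psi$ and 
$\psi_P$ using the relation 

$$\psi(\boldsymbol{x}, t) = \beta\psi_P(-\boldsymbol{x}, t), \tag{4.67}$$

or equivalently 

$$\psi_P(\boldsymbol{x}, t) = \beta\psi(-\boldsymbol{x}, t) \equiv \beta\hat{P}_0\,\psi(\boldsymbol{x}, t). \tag{4.68}$$

Equation (4.68) is the required relation between the wavefunctions in the two systems; it 
may be compared to (4.4) and (4.62). 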
In principle we could include an arbitrary phase factor $\eta_P$ on the right hand side of (4.68) 
and (4.62); such a phase leaves the normalization of $\phi$ and $\psi$, and all bilinears of the form 
$\bar\psi$ (gamma matrix) $\psi$ unaltered. The possibility of such a phase factor did not arise in the 
case of Lorentz transformations, since for infinitesimal ones the transformed $\psi'$ and the 
original $\psi$ differ only infinitesimally (not by a finite phase factor). But the parity transformation 
cannot be built up out of infinitesimal steps: the coordinate system is either reflected or it 
is not. We will choose $\eta_P = 1$. 

As an example of (4.68), consider the free particle solutions in the standard form (3.41), 
(3.72): 

$$\psi(\boldsymbol{x}, t) = N\begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E + m}\,\phi \end{pmatrix}\exp(-iEt + i\boldsymbol{p}\cdot\boldsymbol{x}). \tag{4.69}$$

Then 

$$\psi_P(\boldsymbol{x}, t) = \beta\psi(-\boldsymbol{x}, t) = N\begin{pmatrix} \phi \\ -\dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E + m}\,\phi \end{pmatrix}\exp(-iEt - i\boldsymbol{p}\cdot\boldsymbol{x}) \tag{4.70}$$

which can be conveniently summarized by the simple statement that the three-momentum 
$\boldsymbol{p}$ as seen in the parity transformed system is minus that in the original one, as expected. 
Note that $\boldsymbol{\sigma}$ does not change sign. 

It is also interesting to look at the behaviour of the spinors $\phi$ and $\chi$ in the representation 
(3.40), where they satisfy the equations (4.14) and (4.15). Under parity $\boldsymbol{p} \to -\boldsymbol{p}$, so we can 
immediately see that $\phi_P = \chi$ and $\chi_P = \phi$. Thus the 2-component spinors $\phi$ and $\chi$ are (in 
this representation) interchanged under parity. We may also consider the massless limit of 
equations (4.14) and (4.15). Since $E = |\boldsymbol{p}|$ for a massless particle, we see that the massless 
Weyl spinor $\phi_0$ has helicity $+1$ (referring to (3.68)), while the massless Weyl spinor $\chi_0$ has 
helicity $-1$. For a massless particle, helicity is a Lorentz invariant quantity, since we cannot 
reverse the direction of motion of a particle moving with the speed of light. In the original 
form of the Standard Model (SM), it was assumed that neutrinos were massless, with only 
one helicity, and could therefore be described by massless Weyl spinors. We now know that 
neutrinos are not massless, but their very small masses make detection of the ‘other’ helicity 
very difficult, as will be discussed in section 20.2.2 of volume 2. 

The analysis leading to (4.68) may be extended to the case of the Dirac equation (3.102) 
for a particle of charge $q$ in the field $A^\mu$. As already noted, $\boldsymbol{A}$ is a polar vector, transforming 
under parity like $\boldsymbol{x}$ or $\boldsymbol{\nabla}$; the scalar potential $A^0$ is invariant under parity. The combination 
$(-i\boldsymbol{\nabla} - q\boldsymbol{A})$ therefore changes sign under parity, and the manipulations following (4.65) 
proceed as before. 

We may introduce a corresponding parity operator $\hat{P}$, which is unitary and acts on 
wavefunctions so as to change $\psi$ into $\psi_P$; then 

$$\hat{P}\psi(\boldsymbol{x}, t) \equiv \psi_P(\boldsymbol{x}, t) = \beta\psi(-\boldsymbol{x}, t), \tag{4.71}$$

so that 

$$\hat{P} = \beta\hat{P}_0. \tag{4.72}$$

Applying $\hat{P}$ twice, we find 

$$\hat{P}^2\psi(\boldsymbol{x}, t) = \psi(\boldsymbol{x}, t) \tag{4.73}$$

which implies that the eigenvalues of $\hat{P}$ are $\pm 1$. 

For example, the positive energy rest-frame spinors ((3.73) with $\boldsymbol{p} = \boldsymbol{0}$) are eigenstates 
of $\hat{P}$ with eigenvalue $+1$, and the negative energy rest-frame spinors are eigenstates of $\hat{P}$ 
with eigenvalue $-1$. Such rest-frame eigenvalues of $\hat{P}$ are called intrinsic parities. The 
correspondence between negative energy solutions and antiparticles, discussed in the preceding 
section, then suggests that a fermion and its antiparticle have opposite intrinsic parity 
(note that the parity eigenvalue is multiplicative). We shall be able to derive this result 
after quantization of the Dirac field in chapter 7. 

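As usual in quantum mechanics, we may consider the action of $\hat{P}$ on operators as well 
as wavefunctions. In particular, the parity transform of a Dirac Hamiltonian $\hat{H}(\boldsymbol{x})$ will be 

$$\hat{P}\hat{H}(\boldsymbol{x})\hat{P}^\dagger = \beta\hat{P}_0\hat{H}(\boldsymbol{x})\hat{P}_0^\dagger\beta. \tag{4.74}$$

If the Hamiltonian is invariant under parity, the right-hand side of (4.74) will equal $\hat{H}$ and 
the operator $\hat{P}$ will commute with $\hat{H}$; the eigenvalue of $\hat{P}$ will then be conserved. The 
reader may easily check that the Hamiltonian for the charged particle in a field $A^\mu$ is parity 
invariant, using $\hat{P}_0\boldsymbol{A}\hat{P}_0^\dagger = -\boldsymbol{A}$. 

With the rule (4.68) in hand, we can examine how various bilinear covariants, such as 
$\bar\psi\psi$ or $\bar\psi\gamma^\mu\psi$, transform under parity. For example, 

$$\bar\psi_P(\boldsymbol{x}', t)\psi_P(\boldsymbol{x}', t) = \psi^\dagger(\boldsymbol{x}, t)\beta\beta\beta\psi(\boldsymbol{x}, t) = \bar\psi(\boldsymbol{x}, t)\psi(\boldsymbol{x}, t), \tag{4.75}$$

showing that $\bar\psi\psi$ is a scalar. Similarly, for a 4-vector 

$$v^\mu(\boldsymbol{x}, t) = (v^0(\boldsymbol{x}, t), \boldsymbol{v}(\boldsymbol{x}, t)) = \bar\psi(\boldsymbol{x}, t)\gamma^\mu\psi(\boldsymbol{x}, t), \tag{4.76}$$

the reader may check in problem 4.4(a) that $v^0$ is a scalar and $\boldsymbol{v}$ is a polar vector. 
More interesting possibilities emerge when we introduce a new $\gamma$-matrix, $\gamma_5$, defined by 

$$\gamma_5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3. \tag{4.77}$$

This matrix has the defining property that it anticommutes with the $\gamma^\mu$ matrices: 

$$\{\gamma_5, \gamma^\mu\} = 0. \tag{4.78}$$

Consider now the quantity $p(\boldsymbol{x}, t) \equiv \bar\psi(\boldsymbol{x}, t)\gamma_5\psi(\boldsymbol{x}, t)$. We find 

$$\bar\psi_P(\boldsymbol{x}', t)\gamma_5\psi_P(\boldsymbol{x}', t) = \psi^\dagger(\boldsymbol{x}, t)\beta\beta\gamma_5\beta\psi(\boldsymbol{x}, t) = -\bar\psi(\boldsymbol{x}, t)\gamma_5\psi(\boldsymbol{x}, t), \tag{4.79}$$

so that $p(\boldsymbol{x}, t)$ is a pseudoscalar. Similarly, the reader may verify in problem 4.4(b) that 
the quantity $a^\mu(\boldsymbol{x}, t) \equiv \bar\psi(\boldsymbol{x}, t)\gamma^\mu\gamma_5\psi(\boldsymbol{x}, t)$ transforms under (infinitesimal) rotations and 
boosts as a 4-vector, but that under parity $a^0(\boldsymbol{x}, t)$ is a pseudoscalar and $\boldsymbol{a}(\boldsymbol{x}, t)$ is an axial 
vector. 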
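
The anticommutation property (4.78), and the sign change $\beta\gamma_5\beta = -\gamma_5$ that makes $\bar\psi\gamma_5\psi$ a pseudoscalar in (4.79), can be verified by explicit matrix multiplication. The sketch below is an added illustration; it assumes the standard representation, $\beta = \mathrm{diag}(1, 1, -1, -1)$ with $\boldsymbol{\alpha}$ off-diagonal in $\boldsymbol{\sigma}$, and all helper names are our own.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Standard representation (assumed here): beta diagonal, alpha off-diagonal.
BETA = block(ONE, ZERO, ZERO, scale(-1, ONE))
ALPHA = [block(ZERO, s, s, ZERO) for s in SIGMA]

# gamma^0 = beta, gamma^i = beta alpha_i, gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3.
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]),
                          matmul(GAMMA[2], GAMMA[3])))

if __name__ == "__main__":
    for mu, g in enumerate(GAMMA):
        print("max |{gamma5, gamma^%d}| =" % mu, max_abs(anticommutator(GAMMA5, g)))
    # beta gamma5 beta + gamma5 should vanish, the sign flip used in (4.79).
    print(max_abs(add(matmul(matmul(BETA, GAMMA5), BETA), GAMMA5)))
```

In this representation the product works out to $\gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ in 2 x 2 block form, which the script reproduces.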

Matrix elements formed from $v^\mu$ and $a^\mu$ would have to be Lorentz invariant, of the form 
$v_\mu v^\mu$, $a_\mu a^\mu$, or $v_\mu a^\mu$. For the first of these, we find (shortening the notation) 

$$v_{P\mu}v_P^\mu = v^0v^0 - (-\boldsymbol{v})\cdot(-\boldsymbol{v}) = v_\mu v^\mu, \tag{4.80}$$

and similarly $a_{P\mu}a_P^\mu = a_\mu a^\mu$. Thus both of these matrix elements are scalars, taking the 
same form in both systems. However, this is not true of $v_\mu a^\mu$: 

$$v_{P\mu}a_P^\mu = v^0(-a^0) - (-\boldsymbol{v})\cdot(\boldsymbol{a}) = -v_\mu a^\mu, \tag{4.81}$$

showing that this quantity is a pseudoscalar, changing sign when we change systems. By 
itself, such a sign change would be irrelevant, since observables will depend on the modulus 
squared of the matrix element. If, however, the matrix element for a process has the form 
$(v_\mu - a_\mu)(v^\mu - a^\mu)$, for example, where both scalar and pseudoscalar parts are present, then 
the physics in one coordinate system and in the parity-transformed system will not be the 
same. One says “parity is violated” and only one of the systems can represent the real world; 
parity is conserved if physics in the two coordinate systems is the same. 
Lee and Yang (1956) were the first to point out that, while there was strong evidence for 
parity conservation in strong and electromagnetic interactions, its status in weak 
interactions was at that time untested. They proposed that a clear signal of parity violation could 
be found in weak decays from initially polarized states (i.e. $\langle\boldsymbol{s}\rangle \neq 0$): if the distribution 
of final state particles depends on odd powers of the cosine of the angle between the initial 
spin direction and the final momentum, then parity is violated (note that $\langle\boldsymbol{s}\rangle\cdot\boldsymbol{p}$ is a 
pseudoscalar). The first experiment to demonstrate parity violation was performed by Wu 
et al. (1957), using the $\beta$-decay of polarized $^{60}$Co. Lee and Yang (1956) also remarked that 
parity violation in the decay 

$$\pi^+ \to \mu^+ + \nu_\mu, \tag{4.82}$$

implies that the spin of the muon will be polarized along the direction of its momentum, 
and furthermore that the angular distribution of positrons in the subsequent decay 

$$\mu^+ \to e^+ + \bar\nu_\mu + \nu_e \tag{4.83}$$

would (as in the $^{60}$Co experiment) serve as an analyser. This suggestion was quickly 
confirmed by Garwin et al. (1957) and by Friedman and Telegdi (1957); in the rest frame of the 
pion, the $\mu^+$ spin is aligned opposite to its momentum, a situation that would be reversed 
in the parity transformed frame. 

The end result of many years of research was to establish that the currents responsible 
for weak interactions of quarks and leptons have precisely the ‘$v^\mu - a^\mu$’ structure, leading 
to the observed parity violation (see volume 2). 


4.2.2 Charge conjugation 


Dirac’s hole theory led him to the remarkable prediction of the positron, and suggested a new 
kind of symmetry: to each charged spin-1/2 particle there must correspond an antiparticle 
with the opposite charge and the same mass. Feynman’s interpretation of the negative 
energy solutions of the KG and Dirac equations assumes that this symmetry holds for 
both bosons and fermions. We now explore the idea of particle-antiparticle symmetry more 
formally. 

We begin with the KG equation for a spin-0 particle of mass $m$ and charge $q$ in an 
electromagnetic field $A^\mu$, namely equation (4.1). Inspection of this equation shows at once 
that the wave function $\phi_C$ of a particle with the same mass and charge $-q$ is related to the 
original wavefunction $\phi$ by 

$$\phi_C = \eta_C\,\phi^* \tag{4.84}$$

where $\eta_C$ is an arbitrary phase factor which we shall take to be unity. Equation (4.84) 
tells us how to connect the solutions of the particle (charge $q$) and antiparticle (charge $-q$) 
equations. When applied to free-particle solutions of the KG equation, the transformation 
(4.84) relates positive and negative 4-momentum solutions, as expected in the Feynman 
interpretation of the latter. 

We may extend the transformation (4.84) to a symmetry operation for the KG 
equation (4.1) if we introduce an operation which changes the sign of $A^\mu$. Then the combined 
operation ‘take the complex conjugate of $\phi$ and change $A^\mu$ to $-A^\mu$’ is a formal symmetry 
of (4.1), in the sense that the wavefunction $\phi^*$ in the field $-A^\mu$ satisfies exactly the same 
equation as does the wavefunction $\phi$ in the field $A^\mu$. Of course, we have just seen that $\phi^*$ is 
the antiparticle wavefunction, so it is no surprise that the dynamics of the antiparticle in a 
field $-A^\mu$ is the same as that of the particle in a field $A^\mu$. Still, this is a symmetry of the KG 
equation, which we will call charge conjugation, denoted by $\mathbf{C}$: 

$$\mathbf{C}:\ \phi \to \phi_C = \phi^*, \quad A^\mu \to A_C^\mu = -A^\mu. \tag{4.85}$$




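We can ask: how does the electromagnetic current behave under this transformation? The
expression for the KG current is found by multiplying the free-particle probability current
by the charge $q$, and by replacing $\partial^\mu$ by the gauge-invariant operator
$D^\mu = \partial^\mu + \mathrm{i}qA^\mu$. This leads to

$$\begin{aligned}
j^\mu_{\rm KG\,em}(\phi, A^\mu) &= \mathrm{i}q\{\phi^*(\partial^\mu + \mathrm{i}qA^\mu)\phi - [(\partial^\mu + \mathrm{i}qA^\mu)\phi]^*\phi\} \\
&= \mathrm{i}q[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi] - 2q^2A^\mu\phi^*\phi. \qquad (4.86)
\end{aligned}$$

The current for $\phi_C$, $A^\mu_C$ is then

$$\begin{aligned}
j^\mu_{\rm KG\,em}(\phi_C, A^\mu_C) &= \mathrm{i}q[\phi_C^*\partial^\mu\phi_C - (\partial^\mu\phi_C^*)\phi_C] - 2q^2A^\mu_C\phi_C^*\phi_C \\
&= \mathrm{i}q[\phi\,\partial^\mu\phi^* - (\partial^\mu\phi)\phi^*] + 2q^2A^\mu\phi\phi^* \\
&= -j^\mu_{\rm KG\,em}(\phi, A^\mu). \qquad (4.87)
\end{aligned}$$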


As we would hope, the KG current changes sign under C. 
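Now consider the Dirac equation for a particle of mass $m$ and charge $q$ in a field $A^\mu$,
which we write in the form

$$\frac{\partial\psi}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + \mathrm{i}q\boldsymbol{\alpha}\cdot\boldsymbol{A} - \mathrm{i}\beta m - \mathrm{i}qA^0)\psi. \qquad (4.88)$$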


We want to relate solutions of this equation to the solution $\psi_C$ of the same equation with
$q$ replaced by $-q$. As in the KG case, we begin by writing down the complex conjugate
equation,

$$\frac{\partial\psi^*}{\partial t} = (-\alpha_1\partial_1 + \alpha_2\partial_2 - \alpha_3\partial_3 - \mathrm{i}q\alpha_1A^1 + \mathrm{i}q\alpha_2A^2 - \mathrm{i}q\alpha_3A^3 + \mathrm{i}\beta m + \mathrm{i}qA^0)\psi^* \qquad (4.89)$$

where we have used the fact that $\alpha_1$, $\alpha_3$, and $\beta$ are real and $\alpha_2$ is pure imaginary, which is
the case in both the standard representation of the Dirac matrices, and the representation
(3.40). Now imagine multiplying (4.89) from the left by a matrix $c$, with the properties that
it commutes with $\alpha_1$ and $\alpha_3$, but anticommutes with $\alpha_2$ and $\beta$. Then (4.89) will become

$$\frac{\partial(c\psi^*)}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} - \mathrm{i}q\boldsymbol{\alpha}\cdot\boldsymbol{A} - \mathrm{i}\beta m + \mathrm{i}qA^0)\,c\psi^* \qquad (4.90)$$

which is just (4.88) with $q$ replaced by $-q$. So we may identify the charge-conjugate Dirac
wavefunction as

$$\psi_C = \eta_C\, c\,\psi^* \qquad (4.91)$$

where $\eta_C$ is the usual arbitrary phase factor. The required $c$ is

$$c = \beta\alpha_2 = \gamma^2 \qquad (4.92)$$

as the reader may easily verify. It is customary to choose $\eta_C = \mathrm{i}$, and so finally the connection
between $\psi_C$ and $\psi$ is

$$\psi_C(x) = C_0\psi^*(x), \quad \text{where } C_0 = \mathrm{i}\gamma^2. \qquad (4.93)$$
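
The verification left to the reader can also be sketched numerically. The short check below assumes the standard (Dirac) representation, with $\alpha_i$ off-diagonal in the Pauli matrices and $\beta = \mathrm{diag}(1, 1, -1, -1)$; the helper names `commutes`, `anticommutes`, and `c_checks` are illustrative only and do not come from the text.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard representation (an assumption of this sketch):
# alpha_i = [[0, sigma_i], [sigma_i, 0]], beta = diag(1, 1, -1, -1)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (s1, s2, s3)]
beta = np.block([[I2, Z2], [Z2, -I2]])

# Candidate matrix of (4.92): c = beta alpha_2 = gamma^2
c = beta @ alpha[1]


def commutes(a, b):
    """True if a b = b a."""
    return np.allclose(a @ b, b @ a)


def anticommutes(a, b):
    """True if a b = -b a."""
    return np.allclose(a @ b, -(b @ a))


c_checks = {
    "alpha_1 and alpha_3 are real": np.allclose(alpha[0].imag, 0)
    and np.allclose(alpha[2].imag, 0),
    "alpha_2 is pure imaginary": np.allclose(alpha[1].real, 0),
    "c commutes with alpha_1": commutes(c, alpha[0]),
    "c commutes with alpha_3": commutes(c, alpha[2]),
    "c anticommutes with alpha_2": anticommutes(c, alpha[1]),
    "c anticommutes with beta": anticommutes(c, beta),
}

for name, ok in c_checks.items():
    print(f"{name}: {ok}")
```

These are exactly the properties needed to pass from (4.89) to (4.90); the same check could be repeated in representation (3.40) by changing the definitions of `alpha` and `beta`.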


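Let us look at the effect of the transformation (4.93) on free-particle solutions of the
Dirac equation. Referring to (3.73) we find that a positive energy spinor is transformed to

$$\begin{aligned}
u_C(p, s) &= (E + m)^{1/2}\, \mathrm{i}\gamma^2 \begin{pmatrix} \phi^{s*} \\ \dfrac{\boldsymbol{\sigma}^*\cdot\boldsymbol{p}}{E + m}\,\phi^{s*} \end{pmatrix} \\
&= (E + m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{E + m}\,(-\mathrm{i}\sigma_2\phi^{s*}) \\ -\mathrm{i}\sigma_2\phi^{s*} \end{pmatrix} \qquad (4.94)
\end{aligned}$$

where we have used $\sigma_2^* = -\sigma_2$, $\sigma_2\sigma_1 = -\sigma_1\sigma_2$, and $\sigma_2\sigma_3 = -\sigma_3\sigma_2$. The 4-spinor (4.94) is a
negative energy solution $v(p, s)$ as in (3.82), identifying $-\mathrm{i}\sigma_2\phi^{s*}$ with $\chi^s$. Accordingly we
have shown that

$$u_C(p, s) = v(p, s). \qquad (4.95)$$

Similarly, as the reader may check,

$$v_C(p, s) = \mathrm{i}\gamma^2 v^*(p, s) = u(p, s). \qquad (4.96)$$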


So from a positive energy free-particle spinor associated with 4-momentum p and spin s the 
transformation (4.93) produces a negative energy free-particle spinor associated with the 
same 4-momentum and spin, and vice versa: that is, u and v are charge-conjugate spinors. 
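
The relations (4.95) and (4.96) can be spot-checked numerically. The sketch below assumes the standard representation and takes the free-particle spinors in the forms $u = (E+m)^{1/2}(\phi,\ \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi/(E+m))$ and $v = (E+m)^{1/2}(\boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi/(E+m),\ \chi)$ with $\chi = -\mathrm{i}\sigma_2\phi^*$; the function names and the sample values of `p`, `m`, and `phi` are arbitrary choices made for this illustration.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def sigma_dot(p):
    """sigma . p for a real 3-vector p."""
    return p[0] * s1 + p[1] * s2 + p[2] * s3


def u_spinor(p, m, phi):
    """Positive energy spinor, standard representation (assumed form)."""
    E = np.sqrt(m**2 + p @ p)
    return np.sqrt(E + m) * np.concatenate([phi, sigma_dot(p) @ phi / (E + m)])


def v_spinor(p, m, chi):
    """Negative energy spinor, standard representation (assumed form)."""
    E = np.sqrt(m**2 + p @ p)
    return np.sqrt(E + m) * np.concatenate([sigma_dot(p) @ chi / (E + m), chi])


# C_0 = i gamma^2 with gamma^2 = [[0, sigma_2], [-sigma_2, 0]]
C0 = 1j * np.block([[Z2, s2], [-s2, Z2]])

# Arbitrary sample kinematics and 2-spinor (illustrative values only)
p = np.array([0.3, -0.4, 1.2])
m = 1.0
phi = np.array([0.6, 0.8j], dtype=complex)
chi = -1j * s2 @ phi.conj()

u = u_spinor(p, m, phi)
v = v_spinor(p, m, chi)

u_conjugate_is_v = np.allclose(C0 @ u.conj(), v)   # (4.95)
v_conjugate_is_u = np.allclose(C0 @ v.conj(), u)   # (4.96)
print(u_conjugate_is_v, v_conjugate_is_u)
```

Changing `p`, `m`, or `phi` should not affect either comparison, since the identities hold for any momentum and any 2-spinor.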

At this point we may wonder if it is possible to construct a self-conjugate 4-spinor. Such a 
spinor would be appropriate for a fermionic particle which is the same as its antiparticle— 
that is, for a Majorana fermion, so named after Ettore Majorana who first raised this 
possibility (Majorana 1937). To pursue this idea, it is convenient to use the representation 
(3.40) for the Dirac matrices again, in order to keep track of the Lorentz transformation 
property of the Majorana spinor. Consider the 4-spinor 


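$$\psi_M = \begin{pmatrix} \phi \\ \mathrm{i}\sigma_2\phi^* \end{pmatrix}. \qquad (4.97)$$

Then

$$\psi_{MC} = \mathrm{i}\gamma^2\psi_M^* = \begin{pmatrix} 0 & -\mathrm{i}\sigma_2 \\ \mathrm{i}\sigma_2 & 0 \end{pmatrix} \begin{pmatrix} \phi^* \\ \mathrm{i}\sigma_2\phi \end{pmatrix} = \begin{pmatrix} \phi \\ \mathrm{i}\sigma_2\phi^* \end{pmatrix} = \psi_M, \qquad (4.98)$$

so that indeed $\psi_M$ is self-conjugate. The Lorentz transformation property of $\psi_M$ is consistent,
since we may easily show (problem 4.4(c)) that the 2-spinor $\sigma_2\phi^*$ transforms as a $\chi$-type
spinor. The reader can construct a similar self-conjugate 4-spinor using $\chi$ rather than $\phi$.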
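
A small numerical sketch of the self-conjugacy property follows. It assumes that representation (3.40) has $\boldsymbol{\alpha} = \mathrm{diag}(\boldsymbol{\sigma}, -\boldsymbol{\sigma})$ and $\beta$ off-diagonal, and it builds the Majorana spinor as $(\phi,\ \mathrm{i}\sigma_2\phi^*)$; the sample 2-spinor `phi` and the variable names are hypothetical.

```python
import numpy as np

s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Representation (3.40), as assumed in this sketch
alpha2 = np.block([[s2, Z2], [Z2, -s2]])
beta = np.block([[Z2, I2], [I2, Z2]])

# C_0 = i gamma^2 with gamma^2 = beta alpha_2
C0 = 1j * beta @ alpha2

# Arbitrary 2-spinor (illustrative values only)
phi = np.array([0.3 + 0.4j, -1.2 + 0.5j])

# Majorana 4-spinor built from phi alone
psi_M = np.concatenate([phi, 1j * s2 @ phi.conj()])
psi_MC = C0 @ psi_M.conj()
is_self_conjugate = np.allclose(psi_MC, psi_M)

# A generic Dirac spinor, for contrast, is not self-conjugate
psi_generic = np.array([1, 0, 0, 0], dtype=complex)
generic_is_self_conjugate = np.allclose(C0 @ psi_generic.conj(), psi_generic)

print(is_self_conjugate, generic_is_self_conjugate)
```

The contrast case illustrates that self-conjugacy is a constraint tying the lower two components to the upper two, so a Majorana spinor carries half the independent components of a Dirac spinor.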

A self-conjugate fermion has to carry no distinguishing quantum number, such as elec- 
tromagnetic charge. The only known neutral fermions are the neutrinos, and until quite 
recently it was assumed that they are Dirac fermions, with distinct antiparticles (the rel- 
evant distinguishing quantum number being lepton number). However, as we shall see in 
volume 2, owing to their very small mass, it is hard to discriminate between the two possi- 
bilities (Majorana and Dirac) for neutrinos, and a definitive answer will have to await the 
result of a crucial experiment, the search for neutrinoless double beta decay, which is only 
possible for Majorana neutrinos. 

Returning to more conventional matters, we extend (as in the KG case) the transformation
(4.93) to a formal symmetry of the Dirac equation by including the sign change of $A^\mu$,
so that C for the Dirac equation is

$$\mathrm{C}:\quad \psi \to \psi_C = \mathrm{i}\gamma^2\psi^*, \qquad A^\mu \to -A^\mu. \qquad (4.99)$$


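We now examine how the electromagnetic current behaves under C in the Dirac case. The
Dirac charge density is the probability density $\psi^\dagger\psi$ multiplied by the charge $q$, and the
electromagnetic 3-current is the probability current $\psi^\dagger\boldsymbol{\alpha}\psi$ multiplied by $q$:

$$j^\mu_{\rm D\,em} = (q\psi^\dagger\psi,\ q\psi^\dagger\boldsymbol{\alpha}\psi) = q\bar\psi\gamma^\mu\psi. \qquad (4.100)$$

Consider the charge density; under the transformation (4.93) this becomes

$$q\psi_C^\dagger\psi_C = q\psi^{\rm T}(\mathrm{i}\gamma^2)^\dagger(\mathrm{i}\gamma^2)\psi^* = q\psi^{\rm T}\gamma^{2\dagger}\gamma^2\psi^* = q\psi^{\rm T}\psi^*. \qquad (4.101)$$

In terms of the four components of $\psi$, the product $\psi^{\rm T}\psi^*$ is $\psi_1\psi_1^* + \psi_2\psi_2^* + \psi_3\psi_3^* + \psi_4\psi_4^*$.
These components are ordinary functions which commute with each other, so
$\psi^{\rm T}\psi^* = \psi^{*{\rm T}}\psi = \psi^\dagger\psi$; hence

$$q\psi_C^\dagger\psi_C = q\psi^\dagger\psi \qquad (4.102)$$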


and the charge density does not change sign under C. Similarly, one finds that the electro- 
magnetic 3-current does not change sign either. 
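
Both statements can be illustrated with ordinary complex numbers standing in for the wavefunction components. The sketch below assumes the standard representation and uses a seeded random 4-component spinor; the variable names are illustrative and the charge $q$ is set aside since it is a common factor.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard representation (assumed)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (s1, s2, s3)]
beta = np.block([[I2, Z2], [Z2, -I2]])
C0 = 1j * beta @ alpha[1]              # C_0 = i gamma^2

# Random c-number spinor at a single space-time point (seeded)
rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi_C = C0 @ psi.conj()                # transformation (4.93)

# np.vdot conjugates its first argument, so vdot(a, b) = a^dagger b
density = np.vdot(psi, psi).real
density_C = np.vdot(psi_C, psi_C).real
current = np.array([np.vdot(psi, a @ psi).real for a in alpha])
current_C = np.array([np.vdot(psi_C, a @ psi_C).real for a in alpha])

density_unchanged = np.isclose(density, density_C)
current_unchanged = np.allclose(current, current_C)
print(density_unchanged, current_unchanged)
```

The equality depends on the components being commuting numbers; it is exactly this step that changes when the components become anticommuting field operators.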

These results can be interpreted in the hole theory picture: the current due to a physical 
positive energy antiparticle of charge q and momentum p is regarded as the same as that 
of a missing negative energy particle of charge —q and momentum p. Our charge conjuga- 
tion operation explicitly constructs the positive energy antiparticle wavefunction from the 
negative energy particle one. 

Yet this is not really what we want a true charge conjugation operator to do — which 
is, rather, to change a positive energy particle into a positive energy antiparticle. The same 
inadequacy was true in the KG case also. There is no way of representing such an operation 
in a single particle wavefunction formalism. The appropriate formalism is quantum field
theory, in which $\psi(x)$ becomes a quantum field operator (as do bosonic fields), and there is
a unitary quantum field operator $\hat{C}$ with the required property. We shall see in chapter 7
that fermionic operators anticommute with each other, and that this is just what is needed
to ensure that the current changes sign under $\hat{C}$. Bosonic fields, on the other hand, obey
commutation rather than anticommutation relations, and this safeguards the change in sign 
of the bosonic current. 

We have approached charge conjugation following the historical route, which is to say 
via the electromagnetic interaction. But we can ask whether (true) C is a good symmetry 
of other interactions, for example the weak interaction. Consider applying C to the reaction 
(4.82), so that it becomes 

$$\pi^- \to \mu^- + \bar\nu_\mu. \qquad (4.103)$$

If C were a good symmetry, the (parity-violating) longitudinal polarization of the $\mu^-$ in
(4.103) should be the same as that of the $\mu^+$ in (4.82). But in fact it is the opposite, the $\mu^-$
spin being aligned along the direction of its momentum. So C, like P, is violated in weak
interactions. It is a good symmetry in electromagnetic and strong interactions. 


4.2.3 CP 


It has probably occurred to the reader that, although C and P are each violated in the 
decays (4.82) and (4.103), the combined transformation CP might be a good symmetry: 
particles are changed to antiparticles, the sense of longitudinal polarization is reversed, 
and the corresponding decays occur. Indeed, the rates for these two decays are the same, 
and CP is conserved. For a while, after 1956, it was hoped that CP would prove to be 
always conserved, so as to avoid a ‘lopsided’ distinction between right and left, and between 
matter and antimatter. But before long Christenson et al. (1964) reported evidence for CP 
violation in the decays of neutral K-mesons, a result soon confirmed by other experiments. 

As we mentioned in section 1.2.2, it was the difficulty of incorporating CP violation into 
the 2-generation electroweak theory that led Kobayashi and Maskawa (1973) to propose a 
third generation of quarks, which allowed a CP violating parameter to be included quite 
naturally. CP violation in K-decays is a small effect (of order one part in $10^3$), but
Carter and Sanda (1980) showed that considerably larger effects, up to 20%, could be
expected in rare decays of neutral B mesons, according to the framework of Kobayashi and
Maskawa (KM). Some 20 years later, the “B factories” at the asymmetric $e^-e^+$ colliders
PEPII and KEKB began producing B mesons by the many millions, and intensive study of
CP violation in the $B^0(d\bar{b})$ - $\bar{B}^0(\bar{d}b)$ systems followed at the BaBar and Belle detectors.
Remarkably, all observations to date are consistent with the original KM parametrization.
We shall return to this topic when we discuss weak interactions in volume 2, specifically in
chapter 21. Meanwhile we refer to Bettini (2008), chapter 8, for an introductory overview.

It is worth pausing here to note the significance of CP violation. First of all, it implies
that there is an absolute distinction between matter and antimatter and, as a consequence,
between left and right: these are not merely a matter of convention. For example, the rate
for the process

$$B^0 \to K^+\pi^- \qquad (4.104)$$

is some 20% greater (Nakamura et al. 2010) than the rate for the CP-conjugate process

$$\bar{B}^0 \to K^-\pi^+. \qquad (4.105)$$

(Note that the $\bar{B}^0$ state is conventionally defined as the CP transform of the $B^0$ state.)
So the pion distinguished by being emitted in the higher-yielding reaction (4.104) defines
“negatively charged”, and the polarization of the muon in its decay (4.103) defines what is
a right-handed screw sense.

Secondly, CP (and C) violation is one of the three conditions² established by Sakharov
(1967) that would enable a universe containing initially equal amounts of matter and anti- 
matter, when created in the Big Bang, to evolve into the matter-dominated universe we see 
today—rather than simply having the required imbalance as an initial condition. Within 
the SM, all presently known CP violating effects are attributable to the KM mechanism. 
But calculations show (Huet and Sather 1995) that the matter-antimatter asymmetry gen- 
erated from this source is very many orders of magnitude too small. The observed baryon 
asymmetry is therefore unexplained, so far. However, it is also possible that CP violation 
may eventually be detected in the neutrino sector, as we shall discuss further in volume 2. 
The (rather esoteric) mechanism whereby this might lead to the observed baryon asymmetry
is called “leptogenesis” (Fukugita and Yanagida 1986, Frampton et al. 2002, Pascoli et
al. 2007a, 2007b). But establishing the viability of this solution to the problem is likely to
be a long and slow process. 

Thirdly, CP violation is directly connected to the violation of another discrete symmetry, 
namely time reversal T, because very general principles of quantum field theory imply that 
the product CPT (in any order) is conserved—the CPT theorem. This theorem states 
(Lüders 1954, 1957, Pauli 1957) that CPT must be an exact symmetry for any Lorentz
invariant quantum field theory constructed out of local fields, with a Hermitian Hamiltonian, 
and quantized according to the usual spin-statistics rule (integer spin particles are bosons, 
half-odd integer spin particles are fermions). Thus any violation of CP implies a violation 
of T if CPT is to be conserved. 

We shall return to CPT presently, but first let us deal with T. 


4.2.4 Time reversal 


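The time reversal transformation T is defined by

$$\mathrm{T}:\quad \boldsymbol{x} \to \boldsymbol{x}' = \boldsymbol{x}, \qquad t \to t' = -t; \qquad (4.106)$$

that is, T reverses the direction of time. It follows that T reverses momenta ($\boldsymbol{p} \to -\boldsymbol{p}$) and
angular momenta ($\boldsymbol{x}\times\boldsymbol{p} \to -\boldsymbol{x}\times\boldsymbol{p}$). Let us also note how the electromagnetic potentials
transform under T: $A^0$ does not change, being generated by static charges, while $\boldsymbol{A}$ changes
sign, since it is produced by currents; that is,

$$A^{0\prime}(t') = A^0(t), \qquad \boldsymbol{A}'(t') = -\boldsymbol{A}(t). \qquad (4.107)$$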


² The other two are (a) the existence of baryon number violating transitions and (b) a time when the C,
CP, and baryon number violating transitions proceeded out of thermal equilibrium.

It follows that the electric field $\boldsymbol{E}$ does not change sign under T, but the magnetic field
B does. It is easily checked that these prescriptions ensure that the Maxwell equations are 
covariant under T. 

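Consider first the behaviour of the KG equation for a particle of charge $q$ in the field
$A^\mu$:

$$(\Box + m^2)\phi(t) = -\mathrm{i}q[\partial_\mu A^\mu(t) + A^\mu(t)\partial_\mu]\phi(t) + q^2A^2(t)\phi(t). \qquad (4.108)$$

The equation in the time-reversed system is

$$(\Box + m^2)\phi_T(t') = -\mathrm{i}q[\partial'_\mu A'^\mu(t') + A'^\mu(t')\partial'_\mu]\phi_T(t') + q^2A'^2(t')\phi_T(t'). \qquad (4.109)$$

Using (4.107) we obtain

$$\partial'_\mu A'^\mu(t') = -\partial_\mu A^\mu(t), \qquad A'^\mu(t')\partial'_\mu = -A^\mu(t)\partial_\mu, \qquad A'^2(t') = A^2(t). \qquad (4.110)$$

It follows that we can identify

$$\phi_T(t') = \phi^*(t) \qquad (4.111)$$

up to an arbitrary phase factor, here chosen to be unity. If $\phi$ is a positive-energy free
particle solution, $\phi^*$ represents a particle of positive energy in the time-reversed system,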
with momentum —p as expected. 
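Now consider the behaviour under T of the Dirac equation for a particle of charge $q$ in
a field $A^\mu$,

$$\mathrm{i}\frac{\partial\psi(t)}{\partial t} = \{\boldsymbol{\alpha}\cdot[-\mathrm{i}\boldsymbol{\nabla} - q\boldsymbol{A}(t)] + \beta m + qA^0(t)\}\psi(t) \qquad (4.112)$$

where we have suppressed the spatial coordinate arguments. In the time-reversed system,
the corresponding equation is

$$\mathrm{i}\frac{\partial\psi_T(t')}{\partial t'} = \{\boldsymbol{\alpha}\cdot[-\mathrm{i}\boldsymbol{\nabla} - q\boldsymbol{A}'(t')] + \beta m + qA^{0\prime}(t')\}\psi_T(t'). \qquad (4.113)$$

To relate $\psi_T$ to $\psi$ we start by taking the complex conjugate of (4.112) so as to obtain

$$-\mathrm{i}\frac{\partial\psi^*(t)}{\partial t} = \{\boldsymbol{\alpha}^*\cdot[\mathrm{i}\boldsymbol{\nabla} - q\boldsymbol{A}(t)] + \beta^* m + qA^0(t)\}\psi^*(t) \qquad (4.114)$$

which we may rewrite as

$$\mathrm{i}\frac{\partial\psi^*(t)}{\partial t'} = \{\boldsymbol{\alpha}^*\cdot[\mathrm{i}\boldsymbol{\nabla} + q\boldsymbol{A}'(t')] + \beta^* m + qA^{0\prime}(t')\}\psi^*(t). \qquad (4.115)$$

Now suppose a unitary matrix $U_T$ exists such that

$$U_T\boldsymbol{\alpha}^*U_T^\dagger = -\boldsymbol{\alpha}, \qquad U_T\beta^*U_T^\dagger = \beta; \qquad (4.116)$$

then it is clear that the Dirac equation will be covariant under T with the identification

$$\psi_T(t') = U_T\psi^*(t). \qquad (4.117)$$

In either of the two representations of the Dirac matrices which we have been using, $\alpha_1$, $\alpha_3$,
and $\beta$ are real, while $\alpha_2$ is pure imaginary; it follows that $U_T$ must commute with $\alpha_2$ and
$\beta$, and anticommute with $\alpha_1$ and $\alpha_3$. A suitable $U_T$ is

$$U_T = \mathrm{i}\alpha_1\alpha_3 \qquad (4.118)$$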


where the phase is a conventional choice. 
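
The defining conditions (4.116) can be checked directly for the choice (4.118). The sketch below again assumes the standard representation; the dictionary `ut_checks` and its labels are illustrative names introduced here.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard representation (assumed)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (s1, s2, s3)]
beta = np.block([[I2, Z2], [Z2, -I2]])

U_T = 1j * alpha[0] @ alpha[2]         # equation (4.118)
U_T_dag = U_T.conj().T

ut_checks = {
    "U_T is unitary": np.allclose(U_T @ U_T_dag, np.eye(4)),
    "U_T alpha* U_T^dagger = -alpha": all(
        np.allclose(U_T @ a.conj() @ U_T_dag, -a) for a in alpha
    ),
    "U_T beta* U_T^dagger = beta": np.allclose(U_T @ beta.conj() @ U_T_dag, beta),
    "U_T = diag(sigma_2, sigma_2)": np.allclose(
        U_T, np.block([[s2, Z2], [Z2, s2]])
    ),
}

for name, ok in ut_checks.items():
    print(f"{name}: {ok}")
```

The last entry anticipates the explicit block-diagonal form of $U_T$ in this representation.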
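Let us check what is the effect of the transformation (4.117) on a positive-energy plane
wave solution (3.74). In the representation (3.31) $U_T$ is given by

$$U_T = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \qquad (4.119)$$

and so

$$\begin{aligned}
\psi_T(\boldsymbol{x}', t') &= (E + m)^{1/2} \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} \phi^* \\ \dfrac{\boldsymbol{\sigma}^*\cdot\boldsymbol{p}}{E + m}\,\phi^* \end{pmatrix} \mathrm{e}^{\mathrm{i}(Et - \boldsymbol{p}\cdot\boldsymbol{x})} \\
&= (E + m)^{1/2} \begin{pmatrix} \sigma_2\phi^* \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}'}{E + m}\,\sigma_2\phi^* \end{pmatrix} \mathrm{e}^{-\mathrm{i}(Et' - \boldsymbol{p}'\cdot\boldsymbol{x}')} \qquad (4.120)
\end{aligned}$$

which is a positive-energy solution with the expected momentum $\boldsymbol{p}' = -\boldsymbol{p}$, and with the
transformed spinor wavefunction $\sigma_2\phi^*$. If we take $\phi$ to be a helicity eigenstate

$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|}\,\phi = \lambda\phi \qquad (4.121)$$

where $\lambda = \pm 1$, then it follows that

$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}'}{|\boldsymbol{p}'|}\,\sigma_2\phi^* = \lambda\,\sigma_2\phi^*, \qquad (4.122)$$

and the helicity is unchanged.

As in the case of parity, we may introduce an operator $\hat{T}$ which changes $\phi$ to $\phi_T$ for the
KG equation, and $\psi$ to $\psi_T$ for the Dirac equation. Then

$$\hat{T}(\mathrm{KG}) = KT_0 \qquad (4.123)$$

and

$$\hat{T}(\mathrm{Dirac}) = U_TKT_0 \qquad (4.124)$$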


where $K$ is the complex conjugation operator, and $T_0$ is the time coordinate reversal
operator. The appearance of $K$ is a general feature of time-reversal in quantum mechanics
(Wigner 1964), and has important consequences. Because the transformations involve
complex conjugation, the scalar product of two wavefunctions $\langle\psi_2|\psi_1\rangle$ is not equal to the
corresponding quantity $\langle\psi_{2T}|\psi_{1T}\rangle$, as it would be in the case of parity, for example, or
for any other transformation represented by a unitary operator. Instead, we have

$$\langle\psi_2|\psi_1\rangle = \langle\psi_{2T}|\psi_{1T}\rangle^*. \qquad (4.125)$$

Note, however, that the probability $|\langle\psi_2|\psi_1\rangle|^2$ is still preserved.
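
The content of (4.125) can be illustrated at the level of the spinor components alone. The sketch below assumes the standard representation, takes $\psi_T = U_T\psi^*$ at a single space-time point, and uses seeded random spinors; it checks only the pointwise spinor product, not the full integrated scalar product, and all names are illustrative.

```python
import numpy as np

s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# U_T in the standard representation, equation (4.119) (assumed here)
U_T = np.block([[s2, Z2], [Z2, s2]])

rng = np.random.default_rng(1)
psi_1 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi_2 = rng.normal(size=4) + 1j * rng.normal(size=4)

# Time-reversed spinors: unitary matrix times complex conjugation
psi_1T = U_T @ psi_1.conj()
psi_2T = U_T @ psi_2.conj()

# np.vdot conjugates its first argument: vdot(a, b) = a^dagger b
overlap = np.vdot(psi_2, psi_1)
overlap_T = np.vdot(psi_2T, psi_1T)

overlap_is_conjugated = np.isclose(overlap, np.conj(overlap_T))
probability_preserved = np.isclose(abs(overlap) ** 2, abs(overlap_T) ** 2)
print(overlap_is_conjugated, probability_preserved)
```

For generic complex spinors the overlap has a non-zero imaginary part, so the time-reversed overlap differs from the original even though its modulus is the same.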
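If we consider the matrix element of any operator $\hat{O}$, then since $\hat{O}\psi_1$ is itself a
wavefunction, we must have

$$\langle\psi_2|\hat{O}|\psi_1\rangle = \langle\psi_2|\hat{O}\psi_1\rangle = \langle\psi_{2T}|\hat{T}\hat{O}\psi_1\rangle^* = \langle\psi_{2T}|\hat{T}\hat{O}\hat{T}^{-1}|\psi_{1T}\rangle^* \qquad (4.126)$$

where $\hat{T}\hat{O}\hat{T}^{-1}$ is the operator in the time-reversed system. In particular, if we take $\hat{O}$ to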


³ Complex conjugation also appeared in our discussion of C in section 4.2.2, but as indicated there
the true operator $\hat{C}$ of quantum field theory is unitary. Even in quantum field theory, however, the
time-reversal operator involves complex conjugation, as we shall see in section 7.5.3.
be a Hermitian interaction potential $V$, which is time-reversal invariant, then time-reversal
invariance implies the relation

$$\langle\psi_2|V|\psi_1\rangle = \langle\psi_{2T}|V|\psi_{1T}\rangle^* = \langle\psi_{1T}|V|\psi_{2T}\rangle. \qquad (4.127)$$


Now $\langle\psi_2|V|\psi_1\rangle$ is the amplitude for the state represented by $\psi_1$ to make a transition to
the state represented by $\psi_2$ to first order in the potential $V$ (see section M.3 of appendix M).
Equation (4.127) therefore relates this amplitude to one for the inverse transition, involving 
time-reversed states. The relation in fact holds for the complete (all orders) transition 
operator T (see for example Lee 1981, section 13.5), and enables one to relate rates and 
cross sections for reactions and their inverses. 

For strong interactions, these relations are straightforward to test, and confirm that
strong interactions are T-invariant. So are electromagnetic interactions. In weak
interactions, where the violation of CP and the conservation of CPT imply that T is violated,
it is generally very difficult if not impossible to set up the conditions for an inverse reaction
to occur (consider the inverse of neutron decay, $n \to p\,e^-\bar\nu_e$, for example). However, one
such test is possible in neutral K-decays (Kabir 1970). We can check whether the rate for a
particle tagged at its production as a $K^0$ to decay in a way that identifies it as a $\bar{K}^0$ is equal
to the rate for a particle tagged as $\bar{K}^0$ at its production to decay in a way that identifies
it as a $K^0$. The experiment (Angelopoulos et al. 1998) showed a T-violating difference in
these rates. The parameters determining these reactions had actually been well determined
by other measurements; still, this was an independent and direct demonstration of T
violation. Evidence for T violation in B-meson transitions has been reported by Alvarez and
Szynkman (2008), developing a test suggested by Bañuls and Bernabéu (1999, 2000).

We can also examine the behaviour of various bilinears under T. For example, the reader 
may easily check the results 


$$\bar\psi_T(x')\psi_T(x') = \bar\psi(x)\psi(x), \qquad \bar\psi_T(x')\gamma_5\psi_T(x') = -\bar\psi(x)\gamma_5\psi(x). \qquad (4.128)$$


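Time reversal symmetry will be violated if the theory contains both even and odd amplitudes
under T. An interesting example is provided by the amplitude

$$-\frac{\mathrm{i}}{2}\,d_e\,\bar\psi(x)\sigma^{\mu\nu}\gamma_5\psi(x)F_{\mu\nu}, \qquad (4.129)$$

where

$$\sigma^{\mu\nu} = \frac{\mathrm{i}}{2}(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu) \qquad (4.130)$$

and where $F_{\mu\nu}$ is an external electric field with non-vanishing components $F_{0i} = E^i$. In the
representation (3.31),

$$\sigma^{0i}\gamma_5 = \mathrm{i}\begin{pmatrix} \sigma_i & 0 \\ 0 & \sigma_i \end{pmatrix} = \mathrm{i}\Sigma_i, \qquad (4.131)$$

and (4.129) reduces to

$$d_e\,\bar\psi(x)\boldsymbol{\Sigma}\psi(x)\cdot\boldsymbol{E}. \qquad (4.132)$$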
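
The matrix identity (4.131) is easy to confirm numerically. The sketch below assumes the standard representation with $\gamma^0 = \beta$, $\gamma^i = \beta\alpha_i$, and the common convention $\gamma_5 = \mathrm{i}\gamma^0\gamma^1\gamma^2\gamma^3$; the function name `sigma_mu_nu` is an illustrative choice.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
paulis = (s1, s2, s3)

# Standard representation (assumed): gamma^0 = beta, gamma^i = beta alpha_i
alpha = [np.block([[Z2, s], [s, Z2]]) for s in paulis]
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [gamma0 @ a for a in alpha]

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3 (convention assumed here)
gamma5 = 1j * gamma0 @ gammas[0] @ gammas[1] @ gammas[2]


def sigma_mu_nu(g_mu, g_nu):
    """sigma^{mu nu} = (i/2)(gamma^mu gamma^nu - gamma^nu gamma^mu), as in (4.130)."""
    return 0.5j * (g_mu @ g_nu - g_nu @ g_mu)


# Sigma_i = diag(sigma_i, sigma_i)
Sigma = [np.block([[s, Z2], [Z2, s]]) for s in paulis]

identity_holds = all(
    np.allclose(sigma_mu_nu(gamma0, g) @ gamma5, 1j * S)
    for g, S in zip(gammas, Sigma)
)
print(identity_holds)
```

With this identity, the two non-vanishing contributions $F_{0i}$ and $F_{i0} = -F_{0i}$ add, which is where the spin-field coupling $\boldsymbol{\Sigma}\cdot\boldsymbol{E}$ comes from.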


Problem 4.5 shows that the quantity (4.132) is odd under T, and it is easy to check that it
is also odd under P. A non-zero value of such a term would correspond to an electric dipole
moment for a spin-1/2 particle (compare the analogous quantity $d_m\bar\psi(x)\boldsymbol{\Sigma}\psi(x)\cdot\boldsymbol{B}$ for the
magnetic dipole moment, which is even under P and T). Experiment places very strong
limits on possible electric dipole moments (Workman et al. 2022) for the neutron, proton,
and electron:

$$d_n < 0.18\times10^{-25}\ e\ \mathrm{cm}. \qquad (4.133)$$

$$d_p < 0.021\times10^{-23}\ e\ \mathrm{cm}. \qquad (4.134)$$

$$d_e < 0.11\times10^{-28}\ e\ \mathrm{cm}. \qquad (4.135)$$

Although these numbers seem tiny, calculations of $d_n$ in the SM produce a result some
6 or 7 orders of magnitude smaller than (4.133). However, these experimental limits impose
strong constraints on theories which go beyond the SM, and which may typically contain
the possibility of larger T and CP violating effects.


4.2.5 CPT 


We denote the product CPT by $\theta$ and the corresponding operator by $\hat\theta$. As already
mentioned, for any conventional quantum field theory, and certainly for the SM, the
transformation $\theta$ is an invariance of the theory. One immediate consequence of this invariance is
the equality of particle and antiparticle masses. This is easily demonstrated. Let $|X, s_z\rangle$
be the state of a particle X at rest with z-component of spin equal to $s_z$. The mass of X is
given by the expectation value

$$M_X = \langle X, s_z|H|X, s_z\rangle, \qquad (4.136)$$

where $H$ is the total Hamiltonian. Clearly $M_X$ is real, and independent of $s_z$. Now the
operator $\hat\theta$ involves $\hat{T}$, and therefore we must be careful to use (4.126) rather than the
usual rule for unitary operators. So from (4.126) we have


$$M_X = \langle X, s_z|H|X, s_z\rangle^* = \langle X, s_z|\hat\theta^\dagger\,\hat\theta H\hat\theta^{-1}\,\hat\theta|X, s_z\rangle. \qquad (4.137)$$


If the Hamiltonian is CPT invariant, then $\hat\theta H\hat\theta^{-1} = H$. Also, we know the action of $\hat{P}$, $\hat{C}$,
and $\hat{T}$ on the states, from the previous results. Equation (4.137) then becomes

$$M_X = \langle\bar{X}, -s_z|H|\bar{X}, -s_z\rangle = M_{\bar{X}}, \qquad (4.138)$$

stating the equality of particle and antiparticle masses. The most sensitive test of (4.138) is
provided by the $K^0$ - $\bar{K}^0$ system, where the currently quoted limit for the mass difference
is (Workman et al. 2022)

$$\frac{|M_{K^0} - M_{\bar{K}^0}|}{M_{\rm average}} < 6\times10^{-19} \quad \text{at 90\% C.L.} \qquad (4.139)$$

$\theta$-invariance also implies that the charges of a charged particle and its antiparticle are
equal in magnitude but opposite in sign, as are their magnetic moments; and in the case of 
unstable particles it implies that their lifetimes are equal, to first order in the interaction 
responsible for the decay (Lee 1981). All current data support these equalities (Workman 
et al. 2022). 


Problems 
4.1 Consider an infinitesimal boost along the x-axis,

$$t' = t - \eta x \qquad (4.140)$$
$$x' = x - \eta t. \qquad (4.141)$$

Show that the KG wavefunction transforms according to

$$\phi'(x, t) = (1 + \mathrm{i}\eta K_x)\phi, \qquad (4.142)$$

where

$$K_x = -\mathrm{i}\,x\,\partial/\partial t - \mathrm{i}\,t\,\partial/\partial x. \qquad (4.143)$$

Defining similar operators $K_y$, $K_z$ for boosts in the y and z directions, show that

$$[K_x, K_y] = -\mathrm{i}L_z. \qquad (4.144)$$
4.2 In this problem, use the representation (3.40) for the Dirac matrices, as in section 4.1.2. 


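(a) Using the rule (4.19) for the transformation of the spinor $\phi$ under an infinitesimal
rotation of the coordinate system, verify that $\phi^\dagger\boldsymbol{\sigma}\phi$ transforms as a 3-vector.
[Hint: you need to show that $\phi'^\dagger\boldsymbol{\sigma}\phi' = \phi^\dagger\boldsymbol{\sigma}\phi - \boldsymbol{\epsilon}\times\phi^\dagger\boldsymbol{\sigma}\phi$; use the results of
problem 3.4(a).] Show also that the free-particle Dirac probability current density
is a 3-vector.

(b) Using the rule (4.42) for the transformation of $\phi$ and $\chi$ under an infinitesimal
boost, verify that $\boldsymbol{j} = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi$ transforms as the 3-vector part of the
4-vector $(\rho, \boldsymbol{j})$. [Hint: you need to show that $\boldsymbol{j}' = \boldsymbol{j} - \boldsymbol{\eta}\rho$.]

4.3 (a) Defining the four ‘$\gamma$ matrices’

$$\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$$

where $\gamma^0 = \beta$ and $\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, show that the Dirac equation can be written in the
form $(\mathrm{i}\gamma^\mu\partial_\mu - m)\psi = 0$. Find the anti-commutation relations of the $\gamma$ matrices.
Show that the positive energy spinors $u(p, s)$ satisfy $(\not{p} - m)u(p, s) = 0$, and that
the negative energy spinors $v(p, s)$ satisfy $(\not{p} + m)v(p, s) = 0$, where $\not{p} = \gamma^\mu p_\mu$
(pronounced ‘p-slash’).

(b) Define the conjugate spinor

$$\bar\psi(x) = \psi^\dagger(x)\gamma^0$$

and use the previous result to find the equation satisfied by $\bar\psi$ in $\gamma$ matrix notation.

(c) The Dirac probability current may be written as

$$j^\mu = \bar\psi(x)\gamma^\mu\psi(x).$$

Show that it satisfies the conservation law

$$\partial_\mu j^\mu = 0.$$

4.4 (a) Verify that, under P, $\bar\psi(\boldsymbol{x}, t)\gamma^0\psi(\boldsymbol{x}, t)$ is a scalar, and that $\bar\psi(\boldsymbol{x}, t)\boldsymbol{\gamma}\psi(\boldsymbol{x}, t)$ is a
polar vector.

(b) Verify that $a^\mu(\boldsymbol{x}, t) = \bar\psi(\boldsymbol{x}, t)\gamma^\mu\gamma_5\psi(\boldsymbol{x}, t)$ transforms under infinitesimal rotations
and boosts as a 4-vector; and that under P $a^0(\boldsymbol{x}, t)$ is a pseudoscalar, and $\boldsymbol{a}(\boldsymbol{x}, t)$
is an axial vector.

(c) Show that $\sigma_2\phi^*$ transforms under rotations and boosts as a $\chi$-type spinor, and
that $\sigma_2\chi^*$ transforms as a $\phi$-type spinor.

4.5 Verify that $\bar\psi(\boldsymbol{x}, t)\boldsymbol{\Sigma}\psi(\boldsymbol{x}, t)\cdot\boldsymbol{E}$ of (4.132) is odd under T.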
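4.6 The Galilean transformation (non-relativistic boost) is defined by

$$\boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{v}t, \qquad t' = t.$$

Show that the free-particle time-dependent Schrödinger equation is covariant under this
transformation if the wavefunction transforms according to the rule
$\psi'(\boldsymbol{x}', t') = \exp[\mathrm{i}f(\boldsymbol{x}, t)]\,\psi(\boldsymbol{x}, t)$, where $f(\boldsymbol{x}, t)$ satisfies the condition

$$-\frac{\partial f}{\partial t} - \boldsymbol{v}\cdot\boldsymbol{\nabla}f + \mathrm{i}\boldsymbol{v}\cdot\boldsymbol{\nabla} = \frac{1}{2m}(\boldsymbol{\nabla}f)^2 - \frac{\mathrm{i}}{2m}\nabla^2 f - \frac{\mathrm{i}}{m}\boldsymbol{\nabla}f\cdot\boldsymbol{\nabla}.$$

Find constants $a$ and $\boldsymbol{b}$ such that the function $f = at + \boldsymbol{b}\cdot\boldsymbol{x}$ satisfies this condition. Show
that the resulting transformation rule is consistent with the way you expect a plane wave
solution to transform.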
It was a wonderful world my father told me about.

You might wonder what he got out of it all. I went to MIT. I went to Princeton. I went 
home and he said, ‘Now you’ve got a science education. I have always wanted to know 
something that I have never understood; and so, my son, I want you to explain it to me.’ I 
said yes. 

He said, ‘I understand that they say that light is emitted from an atom when it goes 
from one state to another, from an excited state to a state of lower energy.’ 

I said ‘That’s right.’ 

‘And light is a kind of particle, a photon I think they call it.’ 

‘Yes.’ 

‘So if the photon comes out of the atom when it goes from the excited to the lower state, 
the photon must have been in the atom in the excited state.’ 

I said, ‘Well, no.’ 

He said, ‘Well, how do you look at it so you can think of a particle photon coming out 
without it having been in there in the excited state?’ 

I thought a few minutes, and I said, ‘I’m sorry; I don’t know. I can’t explain it to you.’ 

He was very disappointed after all these years and years trying to teach me something, 
that it came out with such poor results. 


R P Feynman, The Physics Teacher, vol 7, No 6, September 1969 


All the fifty years of conscious brooding have brought me no closer to the answer to 
the question, ‘What are light quanta?’ Of course today every rascal thinks he knows the 
answer, but he is deluding himself. 


A Einstein (1951) 
Quoted in ‘Einstein’s research on the nature of light’ 
E Wolf (1979), Optics News, vol 5, No 1, page 39.


I never satisfy myself until I can make a mechanical model of a thing. If I can make a 
mechanical model I can understand it. As long as I cannot make a mechanical model all the 
way through I cannot understand; and that is why I cannot get the electromagnetic theory. 


[Sir William Thomson, Lord Kelvin, 1884 Notes of Lectures on Molecular Dynamics and 
the Wave Theory of Light delivered at the Johns Hopkins University, Baltimore, steno- 
graphic report by A S Hathaway (Baltimore: Johns Hopkins University) Lecture XX, 
pp 270-1.] 


5 
Quantum Field Theory I: The Free Scalar Field 


In this chapter we shall give an elementary introduction to quantum field theory, which is 
the established ‘language’ of the Standard Model (SM) of particle physics. Even so long after 
Maxwell’s theory of the (classical) electromagnetic field, the concept of a ‘disembodied’ field 
is not an easy one; and we are going to have to add the complications of quantum mechanics 
to it. In such a situation, it is helpful to have some physical model in mind. For most of 
us, as for Lord Kelvin, this still means a mechanical model. Thus in the following two 
sections we begin by considering a mechanical model for a quantum field. At the end, we 
shall—like Maxwell—throw away the ‘mechanism’ and have simply quantum field theory. 
Section 5.1 describes this programme qualitatively; section 5.2 presents a more complete 
formalism, for the simple case of a field whose quanta are massless, and move in only one 
spatial dimension. The appropriate generalizations for massive quanta in three dimensions 
are given in section 5.3. 


5.1 The quantum field: (i) descriptive 


Mechanical systems are usefully characterized by the number of degrees of freedom they 
possess: thus a one-dimensional pendulum has one degree of freedom, two coupled one- 
dimensional pendulums have two degrees of freedom—which may be taken to be their 
angular displacements, for example. A scalar field $\phi(\boldsymbol{x}, t)$ corresponds to a system with an
infinite number of degrees of freedom, since at each continuously varying point $\boldsymbol{x}$ an
independent ‘displacement’ $\phi(\boldsymbol{x}, t)$, which also varies with time, has to be determined. Thus
quantum field theory involves two major mathematical steps: the description of continu- 
ous systems (fields) which have infinitely many degrees of freedom, and the application of 
quantum theory to such systems. These two aspects are clearly separable. It is certainly 
easier to begin by considering systems with a discrete—but possibly very large—number of 
degrees of freedom, for example a solid. We shall treat such systems first classically and then 
quantum mechanically. Then, returning to the classical case, we shall allow the number of 
degrees of freedom to become infinite, so that the system corresponds to a classical field. 
Finally, we shall apply quantum mechanics directly to fields. 

We begin by considering a rather small solid—one that has only two atoms free to 
move. The atoms, each of mass m, are connected by a string, and each is connected to 
a fixed support by a similar string (figure 5.1(a)); all the strings are under tension F. 
We consider small transverse vibrations of the atoms (figure 5.1(b)), and we call $q_r(t)$
($r = 1, 2$) the transverse displacements. We are interested in the total energy $E$ of the
system. According to classical mechanics, this is equal to the sum of the kinetic energies
$\frac{1}{2}m\dot{q}_r^2$ of each atom, together with a potential energy $V$ which can be calculated as follows.
Referring to figure 5.1(b), when atom 1 is displaced by $q_1$, it experiences a restoring force

$$F_1 = F\sin\alpha - F\sin\beta \qquad (5.1)$$


FIGURE 5.1 
A vibrating system with two degrees of freedom: (a) two mass points at rest, with the 
strings under tension; (b) a small transverse displacement. 


assuming a constant tension $F$ along the string. For small displacements $q_1$ and $q_2$ (i.e.
$q_{1,2} \ll l$) we have

$$\sin\alpha = q_1/(l^2 + q_1^2)^{1/2} \approx q_1/l \qquad (5.2)$$
$$\sin\beta = (q_2 - q_1)/[l^2 + (q_2 - q_1)^2]^{1/2} \approx (q_2 - q_1)/l$$

where terms of order $(q_{1,2}/l)^3$ and higher have been neglected. Thus the restoring force on
particle 1 is, in this approximation,

$$F_1 = k(2q_1 - q_2) \qquad (5.3)$$

with $k = F/l$. Similarly, the restoring force on particle 2 is

$$F_2 = k(2q_2 - q_1) \qquad (5.4)$$

and the equations of motion are

$$m\ddot{q}_1 = -k(2q_1 - q_2) \qquad (5.5)$$
$$m\ddot{q}_2 = -k(2q_2 - q_1). \qquad (5.6)$$

The potential energy is then determined (up to an irrelevant constant) by the requirement
that (5.5) and (5.6) are of the form

$$m\ddot{q}_1 = -\partial V/\partial q_1 \qquad (5.7)$$
$$m\ddot{q}_2 = -\partial V/\partial q_2. \qquad (5.8)$$

Thus we deduce that

$$V = k(q_1^2 + q_2^2 - q_1q_2). \qquad (5.9)$$
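
As a quick consistency check on (5.9), one can confirm numerically that its gradient reproduces the restoring forces (5.3) and (5.4). The sketch below is only illustrative: the value $k = 1$, the sample displacements, and the function names are assumptions made for the example.

```python
import numpy as np


def potential(q, k=1.0):
    """Potential energy (5.9): V = k (q1^2 + q2^2 - q1 q2)."""
    q1, q2 = q
    return k * (q1**2 + q2**2 - q1 * q2)


def restoring_forces(q, k=1.0):
    """Restoring forces (5.3), (5.4): F1 = k(2 q1 - q2), F2 = k(2 q2 - q1)."""
    q1, q2 = q
    return k * np.array([2 * q1 - q2, 2 * q2 - q1])


def numerical_gradient(f, q, h=1e-6):
    """Central-difference estimate of the gradient of f at q."""
    grad = np.zeros_like(q)
    for r in range(len(q)):
        step = np.zeros_like(q)
        step[r] = h
        grad[r] = (f(q + step) - f(q - step)) / (2 * h)
    return grad


# Sample small displacements (arbitrary illustrative values)
q = np.array([0.03, -0.01])
grad_V = numerical_gradient(potential, q)

# Equations (5.7), (5.8) require dV/dq_r to equal the restoring force on atom r
gradient_matches = np.allclose(grad_V, restoring_forces(q), atol=1e-8)
print(gradient_matches)
```

Because $V$ is quadratic, the central difference is exact apart from rounding, so the comparison does not depend on the step size `h` in any delicate way.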


Equations (5.5) and (5.6) form a pair of linear, coupled differential equations. Each of the
italicized words is important. By ‘linear’ is meant that only the first power of $q_1$ and $q_2$ and
their time derivatives appear in the equations of motion; terms such as $q_1^2$, $q_1q_2$, $\dot{q}_1^2$, $q_1^3$, and
so on would render the equations of motion ‘nonlinear’. This linear/nonlinear distinction is
a crucial one in dynamics. Most importantly, the solutions of linear differential equations 
may be added together with constant coefficients (‘linearly superposed’) to make new valid 
solutions of the equations. In contrast, solutions of nonlinear differential equations—besides 
being very hard to find!—cannot be linearly superposed to get new solutions. In addition, 
nonlinear dynamical equations may typically lead to chaotic motion. 
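
The superposition property can be seen in a small numerical experiment. The sketch below integrates (5.5) and (5.6) with a velocity-Verlet scheme in units where $k = m = 1$, and compares them with a hypothetical nonlinear variant in which a cubic term $-g q_r^3$ is added to each acceleration; the cubic term, the initial conditions, the step size, and all function names are assumptions made for illustration and are not part of the model in the text.

```python
import numpy as np

# Coupling matrix read off from (5.5) and (5.6)
K = np.array([[2.0, -1.0], [-1.0, 2.0]])


def linear_accel(q, k=1.0, m=1.0):
    """Accelerations from the linear equations of motion (5.5), (5.6)."""
    return -(k / m) * K @ q


def cubic_accel(q, k=1.0, m=1.0, g=1.0):
    """Hypothetical nonlinear variant: adds a cubic restoring term."""
    return linear_accel(q, k, m) - g * q**3


def evolve(accel, q0, v0, dt=1e-3, steps=1000):
    """Velocity-Verlet integration; returns the displacements at the final time."""
    q = np.array(q0, dtype=float)
    v = np.array(v0, dtype=float)
    a = accel(q)
    for _ in range(steps):
        q = q + v * dt + 0.5 * a * dt**2
        a_new = accel(q)
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return q


# Two sets of initial conditions (arbitrary illustrative values)
qa0, va0 = np.array([1.0, 0.0]), np.array([0.0, 0.0])
qb0, vb0 = np.array([0.0, -0.5]), np.array([0.2, 0.0])


def superposition_error(accel):
    """Largest difference between 'sum of solutions' and 'solution of the sum'."""
    separate = evolve(accel, qa0, va0) + evolve(accel, qb0, vb0)
    combined = evolve(accel, qa0 + qb0, va0 + vb0)
    return np.max(np.abs(combined - separate))


linear_error = superposition_error(linear_accel)
cubic_error = superposition_error(cubic_accel)
print(linear_error, cubic_error)
```

For the linear equations the two routes agree to rounding accuracy, whereas for the cubic variant the sum of two solutions is visibly not itself a solution.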
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The notion of linearity/nonlinearity carries over also into the equations of motion for 
fields. In this context, an equation for a field ¢(a, t) is said to be linear if ¢ and its space—or 
time—derivatives appear only to the first power. As we shall see, this is true for Maxwell’s 
equations for the electromagnetic field and it is, of course, the mathematical reason behind 
all the physics of such things as interference and diffraction, which may be understood 
precisely in terms of superposition of solutions of these equations. Likewise the equations 
of quantum mechanics (e.g. Schrödinger’s equation) are all linear in this sense, consistent
with the principle of superposition in quantum mechanics. 

It is clear, then, that in looking at simple mechanical models as a guide to the field 
systems in which we will ultimately be interested, we should consider ones in which the 
equations of motion are linear. In the present case, this is true, but only because we have 
made the approximation that $q_1$ and $q_2$ are small (compared to $l$). Referring to equation
(5.2), we can immediately see that if we had kept the full expression for $\sin\alpha$ and
$\sin\beta$, the resulting equations of motion would have been highly nonlinear. A similar ‘small
displacement’ approximation has to be made in determining the familiar wave equation, 
describing waves on continuous strings, for example (see (5.30) later). Most significantly,
however, quantum mechanics is believed to be a linear theory without any approximation. 

The appearance of only linear terms in $q_1$ and $q_2$ in the equations of motion implies,
via (5.7) and (5.8), that the potential energy can only involve quadratic powers of the $q$’s,
i.e. $q_1^2$, $q_2^2$, and $q_1 q_2$, as in (5.9). Once again, had we used the general expression for the
potential energy in a stretched string as ‘tension × extension’ we would have obtained an
expression containing all powers of the $q$’s via such terms as $\{[l^2 + q_1^2]^{1/2} - l\}$.

We turn now to the coupled aspect of (5.5) and (5.6). By this we mean that the right-hand
side of the $q_1$ equation depends on $q_2$ as well as $q_1$, and similarly for the $q_2$ equation.
This ‘mathematical’ coupling has its origin in the term $-kq_1 q_2$ in $V$, which corresponds
to the ‘physical’ coupling of the string BC connecting the two atoms. If this coupling were
absent, equations (5.5) and (5.6) would describe two independent (uncoupled) harmonic
oscillators, each of frequency $(2k/m)^{1/2}$. When we consider the addition of more and more
particles (see later) we certainly do not want them to vibrate independently, otherwise we 
would not be able to get wave-like displacements propagating through the system. So we 
need to retain at least this minimal kind of ‘quadratic’ coupling. 

With the coupling, the solutions of (5.5) and (5.6) are not quite so obvious. However, a 
simple step makes the equations much easier. Suppose we add the two equations so as to 
obtain 

$$m(\ddot{q}_1 + \ddot{q}_2) = -k(q_1 + q_2) \tag{5.10}$$

and subtract them to obtain

$$m(\ddot{q}_1 - \ddot{q}_2) = -3k(q_1 - q_2). \tag{5.11}$$


A remarkable thing has happened. The two combinations $q_1 + q_2$ and $q_1 - q_2$ of the original
coordinates satisfy uncoupled equations, which are of course very easy to solve. The
combination $q_1 + q_2$ oscillates with frequency $\omega_1 = (k/m)^{1/2}$, while $q_1 - q_2$ oscillates with
frequency $\omega_2 = (3k/m)^{1/2}$.
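
The decoupling in (5.10) and (5.11) amounts to saying that $q_1 + q_2$ and $q_1 - q_2$ are eigenvectors of the force matrix appearing in (5.5) and (5.6). A minimal Python sketch of that check follows; the function name `normal_frequencies` and the sample values of $k$ and $m$ are illustrative assumptions, not taken from the text.

```python
import math

def normal_frequencies(k, m):
    """Normal frequencies of m*q'' = -K q with K = k * [[2, -1], [-1, 2]]."""
    diag, off = 2.0 * k / m, -k / m
    # A symmetric matrix [[d, o], [o, d]] has eigenvectors (1, 1) and (1, -1),
    # i.e. the combinations q1 + q2 and q1 - q2, with eigenvalues d + o and d - o.
    return math.sqrt(diag + off), math.sqrt(diag - off)

# Illustrative values only: k = 4, m = 1.
omega_1, omega_2 = normal_frequencies(k=4.0, m=1.0)
print(omega_1, omega_2)  # (k/m)^(1/2) and (3k/m)^(1/2)
```

The two returned values are the frequencies of the sum and difference combinations respectively, in that order.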

Let us introduce 


$$Q_1 = (q_1 + q_2)/\sqrt{2} \qquad Q_2 = (q_1 - q_2)/\sqrt{2} \tag{5.12}$$

(the $\sqrt{2}$’s are for later convenience). Then the solutions of (5.10) and (5.11) are:

$$Q_1(t) = A\cos\omega_1 t + B\sin\omega_1 t \tag{5.13}$$
$$Q_2(t) = C\cos\omega_2 t + D\sin\omega_2 t. \tag{5.14}$$
FIGURE 5.2
Motion in the two normal modes: (a) frequency $\omega_1$; (b) frequency $\omega_2$.


Suppose that the initial conditions are such that

$$q_1(0) = q_2(0) = a \qquad \dot{q}_1(0) = \dot{q}_2(0) = 0 \tag{5.15}$$

i.e. the atoms are released from rest, at equal transverse displacements $a$. In terms of the
$Q_r$’s, the conditions (5.15) are


$$Q_1(0) = \sqrt{2}\,a \qquad \dot{Q}_1(0) = 0$$
$$Q_2(0) = \dot{Q}_2(0) = 0. \tag{5.16}$$


Thus from (5.13) and (5.14) we find that the complete solution, for these initial conditions, 
is 


$$Q_1(t) = \sqrt{2}\,a\cos\omega_1 t \tag{5.17}$$
$$Q_2(t) = 0. \tag{5.18}$$


We see from (5.18) that the motion is such that $q_1 = q_2$ throughout, and from (5.17) that the
system vibrates with a single definite frequency $\omega_1$. A form of motion in which the system
as a whole moves with a definite frequency is called a normal mode or simply a ‘mode’ for
short. Figure 5.2(a) shows two ‘snapshot’ configurations of our two-atom system when it is
oscillating in the mode characterized by $q_1 = q_2$. In this mode, only $Q_1(t)$ changes; $Q_2(t)$ is
always zero. Another mode also exists in which $q_1 = -q_2$ at all times: here $Q_1(t)$ is zero and
$Q_2(t)$ oscillates with frequency $\omega_2$. Figure 5.2(b) shows two snapshots of the atoms when
they are vibrating in this second mode. The coordinate combinations $Q_1$, $Q_2$, in terms of
which this ‘single frequency motion’ occurs, are called ‘normal mode coordinates’ or normal 
coordinates for short. 
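
The two pure modes can also be seen by integrating (5.5) and (5.6) directly. The sketch below does this with a velocity Verlet scheme, starting from rest; the integrator, the step size, and the choice $k = m = 1$ are assumptions made for illustration only.

```python
import math

def simulate(q1, q2, k=1.0, m=1.0, dt=1e-3, steps=20000):
    """Integrate (5.5)-(5.6) from rest; return a list of (t, q1, q2)."""
    def acc(x, y):
        # Accelerations from the equations of motion (5.5) and (5.6).
        return -k * (2 * x - y) / m, -k * (2 * y - x) / m

    v1 = v2 = 0.0
    a1, a2 = acc(q1, q2)
    history = [(0.0, q1, q2)]
    for n in range(1, steps + 1):
        q1 += v1 * dt + 0.5 * a1 * dt * dt
        q2 += v2 * dt + 0.5 * a2 * dt * dt
        b1, b2 = acc(q1, q2)
        v1 += 0.5 * (a1 + b1) * dt
        v2 += 0.5 * (a2 + b2) * dt
        a1, a2 = b1, b2
        history.append((n * dt, q1, q2))
    return history

symmetric = simulate(0.1, 0.1)       # q1 = q2: only Q1 is excited
antisymmetric = simulate(0.1, -0.1)  # q1 = -q2: only Q2 is excited
```

Starting with equal displacements, the two atoms stay in step and follow $a\cos\omega_1 t$; starting with opposite displacements, they stay opposite and follow $\pm a\cos\omega_2 t$.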

In general, the initial conditions will not be such that the motion is a pure mode; both
$Q_1(t)$ and $Q_2(t)$ will be non-zero. From (5.12) we have

$$q_1(t) = [Q_1(t) + Q_2(t)]/\sqrt{2} \tag{5.19}$$

and

$$q_2(t) = [Q_1(t) - Q_2(t)]/\sqrt{2} \tag{5.20}$$


so that $q_1$ and $q_2$ are expressed as a sum of two terms oscillating with frequencies $\omega_1$ and
$\omega_2$. We say the system is in ‘a superposition of modes’. Nevertheless, the mode idea is still
very important as regards the total energy of the system, as we shall now see. The kinetic
energy can be written in terms of the mode coordinates $Q_r$ as

$$T = \tfrac{1}{2}m\dot{Q}_1^2 + \tfrac{1}{2}m\dot{Q}_2^2 \tag{5.21}$$
while the potential energy $V$ of (5.9) becomes

$$V = \tfrac{1}{2}m\omega_1^2 Q_1^2 + \tfrac{1}{2}m\omega_2^2 Q_2^2 \equiv V(Q_1, Q_2). \tag{5.22}$$

The total energy is therefore

$$E = [\tfrac{1}{2}m\dot{Q}_1^2 + \tfrac{1}{2}m\dot{Q}_2^2] + [\tfrac{1}{2}m\omega_1^2 Q_1^2 + \tfrac{1}{2}m\omega_2^2 Q_2^2]. \tag{5.23}$$


This equation shows that, when written in terms of the normal coordinates, the total energy 
contains no coupling terms of the form $Q_1 Q_2$; indeed, the energy has the remarkable form
of a simple sum of two independent uncoupled oscillators, one with characteristic frequency
$\omega_1$, the other with frequency $\omega_2$. The energy (5.23) has exactly the form appropriate to a
system of two non-interacting ‘things’, each executing simple harmonic motion: the ‘things’ 
are actually the two modes. Modes do not interact, whereas the original atoms do! Of course, 
this decoupling in the expression for the total energy is reflected in the decoupling of the 
equations of motion for the Q variables: 


$$m\ddot{Q}_r = -\frac{\partial V(Q_1, Q_2)}{\partial Q_r} \qquad r = 1, 2. \tag{5.24}$$

It is most important to realize that the modes are non-interacting by virtue of the fact
that we ignored higher than quadratic terms in $V(q_1, q_2)$. Although the simple change of
variables $(q_1, q_2) \to (Q_1, Q_2)$ of (5.12) does remove the $q_1 q_2$ coupling, this would not be
the case if, say, cubic terms in $V$ were to be considered. Such higher order ‘anharmonic’
corrections would produce couplings between the modes; indeed, this will be the basis of
the quantum field theory description of particle interactions (see the following chapter)!

The system under discussion had just two degrees of freedom. We began by describing
it in terms of the obvious degrees of freedom, the physical displacements of the two atoms $q_1$
and $q_2$. But we have learned that it is very illuminating to describe it in terms of the normal
coordinate combinations $Q_1$ and $Q_2$. The normal coordinates are really the relevant degrees
of freedom. Of course, for just two particles, the choice between the $q_r$’s and the $Q_r$’s may
seem rather academic; but the important point, and the reason for going through these
simple manipulations in detail, is that the basic idea of the normal mode, and of normal
coordinates, generalizes immediately to the much less trivial $N$-atom problem (and also to
the field problem). For $N$ atoms there are (for one-dimensional displacements) $N$ degrees
of freedom, and if we take them to be the actual atomic displacements, the total energy will
be

$$E = \sum_{r=1}^{N} \tfrac{1}{2}m\dot{q}_r^2 + V(q_1, \ldots, q_N) \tag{5.25}$$
which includes all the couplings between atoms. We assume, as before, that the $q_r$’s are
small enough so that only quadratic terms need to be kept in $V$ (a constant is as usual
irrelevant, and the linear terms vanish if the $q_r$’s are the displacements from equilibrium).
In this case, the equations of motion will be linear. By a linear transformation of the form 
(generalizing (5.12)) 


$$Q_r = \sum_{s=1}^{N} a_{rs} q_s \tag{5.26}$$


it is possible to write E as a sum of N separate terms, just as in (5.23): 


$$E = \sum_{r=1}^{N} \left[\tfrac{1}{2}m\dot{Q}_r^2 + \tfrac{1}{2}m\omega_r^2 Q_r^2\right]. \tag{5.27}$$

The $Q_r$’s are the normal coordinates and the $\omega_r$’s are the normal frequencies, and there
are $N$ of them. If only one of the $Q_r$’s is non-zero, the $N$ atoms are moving in a single
mode. The fact that the total energy in (5.27) is a sum of N single-mode energies allows 
us to say that our N-atom solid behaves as if it consisted of N separate and free harmonic 
oscillators—which, however, are not to be identified with the coordinates of the original 
atoms. Once again, and now much more crucially, it is the mode coordinates that are the 
relevant degrees of freedom rather than those of the original particles. 
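
To make the $N$-atom statement concrete, suppose (this is an assumption; the text leaves $V$ general) that $N$ equal masses sit on a stretched string with fixed ends and the same nearest-neighbour force as in (5.5) and (5.6). For that chain the normal modes are standing sine waves with $\omega_r = 2(k/m)^{1/2}\sin[r\pi/2(N+1)]$, a standard result. The Python sketch below checks numerically that each such shape oscillates at a single frequency; the function names are hypothetical.

```python
import math

def chain_mode(N, r, k=1.0, m=1.0):
    """Return (omega_r, shape) for mode r of N equal masses on a fixed-end string."""
    omega = 2.0 * math.sqrt(k / m) * math.sin(r * math.pi / (2 * (N + 1)))
    shape = [math.sin(r * s * math.pi / (N + 1)) for s in range(1, N + 1)]
    return omega, shape

def residual(N, r, k=1.0, m=1.0):
    """Largest violation of m*omega^2*q_s = k*(2*q_s - q_{s-1} - q_{s+1})."""
    omega, q = chain_mode(N, r, k, m)
    padded = [0.0] + q + [0.0]  # fixed ends: q_0 = q_{N+1} = 0
    return max(
        abs(m * omega ** 2 * padded[s]
            - k * (2 * padded[s] - padded[s - 1] - padded[s + 1]))
        for s in range(1, N + 1)
    )

worst = max(residual(10, r) for r in range(1, 11))  # should be at rounding level
```

For $N = 2$ the formula reduces to the two frequencies found earlier, $(k/m)^{1/2}$ and $(3k/m)^{1/2}$.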

The second stage in our programme is to treat such systems quantum mechanically, as 
we should certainly have to for a real solid. It is still true that—if the potential energy 
is a quadratic function of the displacements—the transformation (5.26) allows us to write 
the total energy as a sum of N mode energies, each of which has the form of a harmonic 
oscillator. Now, however, these oscillators obey the laws of quantum mechanics, so that each 
mode oscillator exists only in certain definite states, whose energy eigenvalues are quantized. 
For each mode of frequency $\omega_r$, the allowed energy values are

$$E_r = (n_r + \tfrac{1}{2})\hbar\omega_r \tag{5.28}$$

where $n_r$ is a positive integer or zero. This is in sharp contrast to the classical case, of
course, in which arbitrary values are allowed for the oscillator energies. The total energy 
eigenvalue then has the form 


$$E = \sum_{r=1}^{N} (n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.29}$$


The frequencies $\omega_r$ are determined by the interatomic forces and are common to both
the classical and quantum descriptions; in quantum theory, though, the states of definite
energy of the vibrating $N$-body system are characterized by the values of a set of integers
$(n_1, n_2, \ldots, n_N)$, which determine the energies of each mode oscillator.
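
As a small illustration of (5.29), the following Python sketch evaluates the energy eigenvalue for a given set of integers $n_r$. The function name `total_energy`, the unit choice $\hbar = 1$, and the sample frequencies (those of the two-atom system with $k/m = 1$) are assumptions for the example.

```python
def total_energy(occupations, omegas, hbar=1.0):
    """Energy eigenvalue (5.29) for occupation numbers n_r and frequencies omega_r."""
    if any(n < 0 or n != int(n) for n in occupations):
        raise ValueError("each n_r must be a non-negative integer")
    return sum((n + 0.5) * hbar * w for n, w in zip(occupations, omegas))

omegas = [1.0, 3.0 ** 0.5]              # two-atom frequencies for k/m = 1
ground = total_energy([0, 0], omegas)   # zero-point energy only
excited = total_energy([2, 1], omegas)  # n_1 = 2, n_2 = 1
```

The difference `excited - ground` is $2\hbar\omega_1 + \hbar\omega_2$: the state is fixed entirely by how many quanta each mode carries.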

For each mode oscillator, $\hbar\omega_r$ measures the quantum of vibrational energy; the energy
of an allowed mode state is determined uniquely by the number $n_r$ of such quanta of energy
in the state. We now make a profound reinterpretation of this result (first given, almost en
passant, by Born, Heisenberg and Jordan (Born et al. 1926) in one of the earliest papers on
quantum mechanics). We forget about the original $N$ degrees of freedom $q_1, q_2, \ldots, q_N$ and
the original $N$ ‘atoms’, which indeed are only remembered in (5.29) via the fact that there
are $N$ different mode frequencies $\omega_r$. Instead we concentrate on the quanta and treat them
as ‘things’ which really determine the behaviour of our quantum system. We say that ‘in a
state with energy $(n_r + \tfrac{1}{2})\hbar\omega_r$ there are $n_r$ quanta present’. For the state characterized by
$(n_1, n_2, \ldots, n_N)$ there are $n_1$ quanta of mode 1 (frequency $\omega_1$), $n_2$ of mode 2, ... and $n_N$ of
mode $N$. Note particularly that although the number of modes $N$ is fixed, the values of the
$n_r$’s are unrestricted, except insofar as the total energy is fixed. Thus we are moving from
a ‘fixed number’ picture ($N$ degrees of freedom) to a ‘variable number’ picture (the $n_r$’s
restricted only by the total energy constraint (5.29)). In the case of a real solid, these quanta
of vibrational energy are called phonons. We summarize the point we have reached by the 
important statement that a phonon is an elementary quantum of vibrational excitation. 

Now we take one step backward in order, afterwards, to take two steps forward. We 
return to the classical mechanical model with N harmonically interacting degrees of freedom. 
It is possible to imagine increasing the number N to infinity, and decreasing the interatomic 
spacing $a$ to zero, in such a way that the product $Na$ stays finite, say $Na = \ell$. We then have
a classical continuous system, for example a string of length $\ell$. (We stay in one dimension
for simplicity.) The transverse vibrations of this string are now described by a field $\phi(x, t)$,
where at each point $x$ of the string $\phi(x, t)$ measures the displacement from equilibrium, at
the time $t$, of a small element of string around the point $x$. Thus we have passed from a
system described by a discrete number of degrees of freedom, $q_r(t)$ or $Q_r(t)$, to one described
by a continuous degree of freedom, the displacement field $\phi(x, t)$. The discrete suffix $r$ has
become the continuous argument $x$ and, to prepare for later abstraction, we have denoted
the displacement by $\phi(x, t)$ rather than, say, $q(x, t)$.

FIGURE 5.3
String motion in two normal modes: (a) $r = 1$ in equation (5.31); (b) $r = 2$.

In the continuous problem the analogue of the small-displacement assumption, which
limited the potential energy in the discrete case to quadratic powers, implies that $\phi(x, t)$
obeys the wave equation

$$\frac{1}{c^2}\frac{\partial^2\phi(x, t)}{\partial t^2} = \frac{\partial^2\phi(x, t)}{\partial x^2} \tag{5.30}$$

where $c$ is the wave propagation velocity. Note that (5.30) is linear, but only by virtue
of having made the small-displacement assumption. Again, we consider first the classical
treatment of this system. Our aim is to find, for this continuous field problem, the analogue
of the normal coordinates (or, in physical terms, the modes of vibration) which were so
helpful in the discrete case. Fortunately, the string’s modes are very familiar. By imposing
suitable boundary conditions at each end of the string, we determine the allowed wavelengths
of waves travelling along the string. Suppose, for simplicity, that the string is stretched
between $x = 0$ and $x = \ell$. This constrains $\phi(x, t)$ to vanish at these end points. A suitable
form for $\phi(x, t)$ which does this is

$$\phi_r(x, t) = A_r(t)\sin\left(\frac{r\pi x}{\ell}\right) \tag{5.31}$$

where $r = 1, 2, 3, \ldots$, which expresses the fact that an exact number of half-wavelengths
must fit onto the interval $(0, \ell)$. Inserting (5.31) into (5.30), we find

$$\ddot{A}_r = -\omega_r^2 A_r \tag{5.32}$$

where

$$\omega_r^2 = r^2\pi^2 c^2/\ell^2. \tag{5.33}$$


Thus the amplitude $A_r(t)$ of the particular waveform (5.31) executes simple harmonic motion
with frequency $\omega_r$. Each motion of the string which has a definite wavelength also has
a definite frequency; it is therefore precisely a mode. Figure 5.3(a) shows two snapshots of 
the string when it is oscillating in the mode for which r = 1, and figure 5.3(b) shows the 
same for the mode r = 2; these may be compared with figures 5.2(a) and (b). Just as in 
the discrete case, the general motion of the string is a superposition of modes 


$$\phi(x, t) = \sum_{r=1}^{\infty} A_r(t)\sin\left(\frac{r\pi x}{\ell}\right); \tag{5.34}$$


in short, a Fourier series! 
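
The mode amplitudes in (5.34) at a given instant are the Fourier sine coefficients of the string's shape, $A_r = (2/\ell)\int_0^\ell \phi(x)\sin(r\pi x/\ell)\,dx$. The Python sketch below computes them by a midpoint rule for a string plucked at its centre; the triangular profile, its height, and the function names are illustrative assumptions.

```python
import math

def sine_coefficient(phi, r, length, n=2000):
    """A_r = (2/length) * integral of phi(x) * sin(r*pi*x/length), midpoint rule."""
    dx = length / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += phi(x) * math.sin(r * math.pi * x / length)
    return 2.0 * total * dx / length

def plucked(x, length=1.0, height=0.1):
    """Triangular profile: string pulled up by `height` at its midpoint."""
    if x < length / 2:
        return 2 * height * x / length
    return 2 * height * (1 - x / length)

# Amplitudes of the first five modes for a string of unit length.
coeffs = [sine_coefficient(plucked, r, 1.0) for r in range(1, 6)]
```

For this symmetric shape the even-$r$ amplitudes vanish, and the odd ones fall off as $1/r^2$ with alternating sign.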
We must now examine the total energy of the vibrating string, which we expect to
be greatly simplified by the use of the mode concept. The total energy is the continuous 
analogue of the discrete summation in (5.25), namely the integral 


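$$E = \int_0^{\ell}\left[\tfrac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 + \tfrac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right]dx \tag{5.35}$$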


where the first term is the kinetic energy and the second is the potential energy ($\rho$ is the
mass per unit length of the string, assumed constant). As noted earlier, the potential energy
term arises from an approximation which limits it to the quadratic power. To relate this to
the earlier discrete case, note that the derivative may be regarded as $[\phi(x + \delta x) - \phi(x)]/\delta x$
as $\delta x \to 0$, so that the square of the derivative involves the ‘nearest neighbour coupling’
$\phi(x + \delta x)\phi(x)$, analogous to the $q_1 q_2$ term in (5.9).

Inserting (5.34) into (5.35), and using the orthonormality of the sine functions on the
interval $(0, \ell)$, one obtains (problem 5.1) the crucial result

$$E = (\ell/2)\sum_{r=1}^{\infty}\left[\tfrac{1}{2}\rho\dot{A}_r^2 + \tfrac{1}{2}\rho\omega_r^2 A_r^2\right]. \tag{5.36}$$


Indeed, just as in the discrete case, the total energy of the string can be written as a 
sum of individual mode energies. We note that the Fourier amplitude $A_r$ acts as a normal
coordinate. Comparing (5.36) with (5.27), we see that the string behaves exactly like a
system of independent uncoupled oscillators, the only difference being that now there are
an infinite number of them, corresponding to the infinite number of degrees of freedom in
the continuous field $\phi(x, t)$. The normal coordinates $A_r(t)$ are, for many purposes, a much
more relevant set of degrees of freedom than the original displacements $\phi(x, t)$.
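
The equality of (5.35) and (5.36) can be spot-checked numerically for a field built from a few modes. The Python sketch below integrates the energy density directly and compares it with the mode sum, using $\omega_r = r\pi c/\ell$ from (5.33); the sample amplitudes, the unit values of $\rho$, $c$ and $\ell$, and the function names are assumptions for illustration.

```python
import math

def energy_integral(A, A_dot, rho=1.0, c=1.0, length=1.0, n=4000):
    """Evaluate (5.35) by the midpoint rule for phi = sum_r A_r sin(r*pi*x/length)."""
    dx = length / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        # Time and space derivatives of phi at this point (list index 0 is r = 1).
        phi_t = sum(ad * math.sin((j + 1) * math.pi * x / length)
                    for j, ad in enumerate(A_dot))
        phi_x = sum(a * (j + 1) * math.pi / length
                    * math.cos((j + 1) * math.pi * x / length)
                    for j, a in enumerate(A))
        total += 0.5 * rho * phi_t ** 2 + 0.5 * rho * c ** 2 * phi_x ** 2
    return total * dx

def mode_sum(A, A_dot, rho=1.0, c=1.0, length=1.0):
    """Evaluate (5.36) with omega_r = r*pi*c/length."""
    return (length / 2) * sum(
        0.5 * rho * ad ** 2
        + 0.5 * rho * ((j + 1) * math.pi * c / length) ** 2 * a ** 2
        for j, (a, ad) in enumerate(zip(A, A_dot))
    )

A, A_dot = [0.3, -0.1, 0.05], [0.0, 0.2, -0.4]
difference = energy_integral(A, A_dot) - mode_sum(A, A_dot)
```

The cross terms between different modes integrate to zero, so `difference` is at the level of rounding error.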

The final step is to apply quantum mechanics to this classical field system. Once again, 
the total energy is equivalent to that of a sum of (infinitely many) mode oscillators, each 
of which has to be quantized. The total energy eigenvalue has the form (5.29), except that 
now the sum extends to infinity: 


$$E = \sum_{r=1}^{\infty} (n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.37}$$


The excited states of the quantized field $\phi(x, t)$ are characterized by saying how many
phonons of each frequency are present; the ground state has no phonons at all. We remark
that as $\ell \to \infty$, the mode sum in (5.36) or (5.37) will be replaced by an integral over a
continuous frequency variable. 

We have now completed, in outline, the programme introduced earlier, ending up with 
the quantization of a ‘mechanical’ system. All of the foregoing, it must be clearly em- 
phasized, is absolutely basic to modern solid state physics. The essential idea—quantizing 
independent modes—can be applied to an enormous variety of ‘oscillations’. In all cases the 
crucial concept is the elementary excitation—the mode quantum. Thus we have plasmons 
(quanta of plasma oscillations), magnons (magnetic oscillations), ..., as well as phonons 
(vibrational oscillations). All this is securely anchored in the physics of many-body systems. 

Now we come to the use of these ideas as an analogy, to help us understand the (pre- 
sumably non-mechanical) quantum fields with which we shall actually be concerned in this 
book—for example the electromagnetic field. Consider a region of space containing electro- 
magnetic fields. These fields obey (a three-dimensional version of) the wave equation (5.30), 
with c now standing for the speed of light. By imposing suitable boundary conditions, the 
total electromagnetic energy in any region of space can be written as a sum of mode ener- 
gies. Each mode has the form of an oscillator, whose amplitude is (see (5.31)) the Fourier 
component of the wave, for a given wavelength. These oscillators are all quantized. Their
quanta are called photons. Thus, a photon is an elementary quantum of excitation of the 
electromagnetic field. 

So far, the only kind of ‘particle’ we have in our relativistic quantum field theoretic world 
is the photon. What about the electron, say? Well, recalling Feynman again, ‘There is one 
lucky break, however—electrons behave just like light’. In other words, we shall also regard 
an electron as an elementary quantum of excitation of an ‘electron field’. What is ‘waving’ 
to supply the vibrations for this electron field? We do not answer this question just as we did 
not for the photon. We postulate a relativistic quantum field for the electron which obeys 
some suitable wave equation—in this case, for non-interacting electrons, the Dirac equation. 
The field is expanded as a sum of Fourier components, as with the electromagnetic field. 
Each component behaves as an independent oscillator degree of freedom (and there are, of 
course, an infinite number of them); the quanta of these oscillators are electrons. 

Actually this, though correctly expressing the basic idea, omits one crucial factor, which 
makes it almost fraudulently oversimplified. There is of course one very big difference be- 
tween photons and electrons. The former are bosons and the latter are fermions; photons 
have spin angular momentum of one (in units of $\hbar$), electrons of one-half. It is very difficult,
if not downright impossible, to construct any mechanical model at all which has fermionic
excitations. Phonons have spin-1, in fact, corresponding to the three states of polarization
of the corresponding vibrational waves. But ‘phonons’ carrying spin-$\tfrac{1}{2}$ are hard to come by.
No matter, you may say, Maxwell has weaned us away from jelly, so we shall be grown up 
and boldly postulate the electron field as a basic thing. 

Certainly, this is what we do. But we also know that fermionic particles, like electrons, 
have to obey an exclusion principle: no two identical fermions can have the same quantum 
numbers. In chapter 7, we shall learn how the idea sketched here must be modified for fields 
whose quanta are fermions. 


5.2 The quantum field: (ii) Lagrange—Hamilton formulation 
5.2.1 The action principle: Lagrangian particle mechanics 


We must now make the foregoing qualitative picture more mathematically precise. It is clear 
that we would like a formalism capable of treating, within a single overall framework, the 
mechanics of both fields and particles, in both classical and quantum aspects. Remarkably 
enough, such a framework does exist (and was developed long before quantum field theory): 
Hamilton’s principle of least action, with the action defined in terms of a Lagrangian. We 
strongly recommend the reader with no prior acquaintance with this profound approach to 
physical laws to read chapter 19 of volume 2 of Feynman’s Lectures on Physics (Feynman 
1964). 

The least action approach differs radically from the more familiar one which can
conveniently be called ‘Newtonian’. Consider the simplest case, that of classical particle
mechanics. In the Newtonian approach, equations of motion are postulated which involve 
forces as the essential physical input; from these, the trajectories of the particle can be 
calculated. In the least action approach, equations of motion are not postulated as basic, 
and the primacy of forces yields to that of potentials. The path by which a particle actually 
travels is determined by the postulate (or principle) that it has to follow that particular 
path, out of infinitely many possible ones, for which a certain quantity—the action—is 
FIGURE 5.4
Possible space-time trajectories from ‘Here’ ($q(t_1)$) to ‘There’ ($q(t_2)$).


minimized. The action S is defined by 


$$S = \int_{t_1}^{t_2} L(q(t), \dot{q}(t))\,dt \tag{5.38}$$


where $q(t)$ is the position of the particle as a function of time, $\dot{q}(t)$ is its velocity and the
all-important function $L$ is the Lagrangian. Given $L$ as an explicit function of the variables
$q(t)$ and $\dot{q}(t)$, we can imagine evaluating $S$ for all sorts of possible $q(t)$’s starting at time
$t_1$ and ending at time $t_2$. We can draw these different possible trajectories on a $q$ versus $t$
diagram as in figure 5.4. For each path we evaluate S; the actual path is the one for which 
S is smallest, by hypothesis. 

But what is $L$? In simple cases (as we shall verify later) $L$ is just $T - V$, the difference
of kinetic and potential energies. Thus for a single particle in a potential $V$

$$L = \tfrac{1}{2}m\dot{x}^2 - V(x). \tag{5.39}$$

Knowing V(x), we can try and put the ‘action principle’ into action. However, how can we 
set about finding which trajectory minimizes S? It is quite interesting to play with some 
simple specific examples and actually calculate S for several ‘fictitious’ trajectories—i.e. 
ones that we know from the Newtonian approach are not followed by the particle—and try 
and get a feeling for what the actual trajectory that minimizes S might be like (of course 
it is the Newtonian one—see problem 5.2). But clearly this is not a practical answer to the 
general problem of finding the q(t) that minimizes S. Actually, we can solve this problem 
by calculus. 
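
In the spirit of playing with specific examples, here is a minimal Python sketch that evaluates $S$ for a family of trial paths. The setting is an assumption chosen for simplicity, not the one in problem 5.2: a particle in a uniform field, $V = mgx$, travelling from $x = 0$ at $t = 0$ back to $x = 0$ at $t = T$. The Newtonian path is $x_c(t) = \tfrac{1}{2}gt(T - t)$, and the fictitious paths add a deformation $\epsilon\sin(\pi t/T)$ that respects the end points.

```python
import math

def action(eps, m=1.0, g=9.8, T=1.0, n=2000):
    """Action (5.38) with L = m*v^2/2 - m*g*x on x(t) = g*t*(T-t)/2 + eps*sin(pi*t/T)."""
    dt = T / n
    S = 0.0
    for i in range(n):
        t = (i + 0.5) * dt  # midpoint of each time slice
        x = 0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T)
        v = 0.5 * g * (T - 2 * t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
        S += (0.5 * m * v * v - m * g * x) * dt
    return S

# Action for the Newtonian path (eps = 0) and for four deformed paths.
actions = {eps: action(eps) for eps in (-0.5, -0.1, 0.0, 0.1, 0.5)}
```

For this family the action exceeds its Newtonian value by $m\epsilon^2\pi^2/4T$, so the smallest entry in `actions` is the one with $\epsilon = 0$.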

Our problem is something like the familiar one of finding the point $t_0$ at which a certain
function $f(t)$ has a stationary value. In the present case, however, the function $S$ is not a
simple function of $t$; rather it is a function of the entire set of points $q(t)$. It is a function of
the function $q(t)$, or a functional of $q(t)$. We want to know what particular ‘$q_c(t)$’ minimizes
$S$.

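By analogy with the single-variable case, we consider a small variation $\delta q(t)$ in the path
from $q(t_1)$ to $q(t_2)$. At the minimum, the change $\delta S$ corresponding to the change $\delta q$ must
vanish. This change in the action is given by

$$\delta S = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q(t)}\delta q(t) + \frac{\partial L}{\partial\dot{q}(t)}\delta\dot{q}(t)\right)dt. \tag{5.40}$$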
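Using $\delta\dot{q}(t) = d(\delta q(t))/dt$ and integrating the second term by parts yields

$$\delta S = \left[\frac{\partial L}{\partial\dot{q}(t)}\delta q(t)\right]_{t_1}^{t_2} + \int_{t_1}^{t_2}\delta q(t)\left[\frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)}\right]dt. \tag{5.41}$$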


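Since we are considering variations of path in which all trajectories start at $t_1$ and end at
$t_2$, $\delta q(t_1) = \delta q(t_2) = 0$. So the condition that $S$ be stationary is

$$\int_{t_1}^{t_2}\delta q(t)\left[\frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)}\right]dt = 0. \tag{5.42}$$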


Since this must be true for arbitrary $\delta q(t)$, we must have

$$\frac{\partial L}{\partial q(t)} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}(t)} = 0. \tag{5.43}$$

This is the celebrated Euler-Lagrange equation of motion. Its solution gives the ‘$q_c(t)$’ which
the particle actually follows.
We can see how this works for the simple case (5.39) where q is the coordinate x. We 
have immediately 
$$\partial L/\partial\dot{x} = m\dot{x} = p \tag{5.44}$$

and

$$\partial L/\partial x = -\partial V/\partial x = F \tag{5.45}$$


where p and F are, respectively, the momentum and the force of the Newtonian approach. 
The Euler-Lagrange equation then reads 


F = dp/dt (5.46) 


precisely the Newtonian equation of motion. For the special case of a harmonic oscillator 
(obviously fundamental for the quantum field idea, as section 5.1 should have made clear), 
we have 

$$L = \tfrac{1}{2}m\dot{x}^2 - \tfrac{1}{2}m\omega^2 x^2 \tag{5.47}$$


which can be immediately generalized to N independent oscillators (see section 5.1) via 


$$L = \sum_{r=1}^{N}\left(\tfrac{1}{2}m\dot{Q}_r^2 - \tfrac{1}{2}m\omega_r^2 Q_r^2\right). \tag{5.48}$$


For many dynamical systems, the Lagrangian has the form ‘T — V’ indicated in (5.47) 
and (5.48). 

Our next step will be to replace classical particle mechanics by quantum particle mechan- 
ics. The standard way to do this is via the Hamiltonian formulation of classical mechanics, 
which we will now briefly review for the simple system with Lagrangian (5.39). In Hamiltonian
dynamics, the variables used are not the Lagrangian ones of position $x$ and velocity
$\dot{x}$, but rather the position $x$ and the canonical momentum $p$, where $p$ is defined by

$$p = \frac{\partial L}{\partial\dot{x}}. \tag{5.49}$$

The place of the Lagrangian is taken by the Hamiltonian $H(x, p)$ which is defined by

$$H(x, p) = p\dot{x} - L. \tag{5.50}$$
Using (5.39) for $L$ we find $p = m\dot{x}$, and placing this result in (5.50) we obtain

$$H(x, p) = \frac{p^2}{2m} + V(x) \tag{5.51}$$


which in this case is just the total energy, expressed in terms of $x$ and $p$. Instead of the
Euler-Lagrange equation we have the Hamiltonian equations of motion, which are

$$\dot{x} = \frac{\partial H}{\partial p} \tag{5.52}$$

and

$$\dot{p} = -\frac{\partial H}{\partial x}. \tag{5.53}$$

For the case (5.51) these equations yield

$$p/m = \dot{x} \tag{5.54}$$

and

$$\dot{p} = -\partial V/\partial x. \tag{5.55}$$

Equation (5.54) is just the familiar relation of $p$ to $\dot{x}$, and (5.55) is the Newtonian equation
of motion. In the same way, the reader may check that the Hamiltonian for the assembly of 
oscillators described by the Lagrangian (5.48) is 


$$H = \sum_{r=1}^{N}\left(\frac{P_r^2}{2m} + \tfrac{1}{2}m\omega_r^2 Q_r^2\right) \tag{5.56}$$

where $P_r = m\dot{Q}_r$.
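
Hamilton's equations (5.52) and (5.53) lend themselves to direct numerical integration, since they give $\dot{x}$ and $\dot{p}$ from derivatives of a single function $H(x, p)$. The Python sketch below is one way to do this under stated assumptions: the partial derivatives are taken by central differences, the time stepping is classical fourth-order Runge-Kutta, and the test system is a single oscillator with $m = \omega = 1$. None of these choices comes from the text.

```python
import math

def hamilton_rhs(H, x, p, h=1e-5):
    """Right-hand sides of (5.52)-(5.53): (dH/dp, -dH/dx) by central differences."""
    dH_dp = (H(x, p + h) - H(x, p - h)) / (2 * h)
    dH_dx = (H(x + h, p) - H(x - h, p)) / (2 * h)
    return dH_dp, -dH_dx

def evolve(H, x, p, dt, steps):
    """Advance (x, p) with classical fourth-order Runge-Kutta."""
    for _ in range(steps):
        k1x, k1p = hamilton_rhs(H, x, p)
        k2x, k2p = hamilton_rhs(H, x + 0.5 * dt * k1x, p + 0.5 * dt * k1p)
        k3x, k3p = hamilton_rhs(H, x + 0.5 * dt * k2x, p + 0.5 * dt * k2p)
        k4x, k4p = hamilton_rhs(H, x + dt * k3x, p + dt * k3p)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p

def oscillator(x, p, m=1.0, omega=1.0):
    """One term of (5.56): H = p^2/2m + m*omega^2*x^2/2."""
    return p * p / (2 * m) + 0.5 * m * omega ** 2 * x * x

x_end, p_end = evolve(oscillator, 1.0, 0.0, dt=0.01, steps=1000)  # up to t = 10
```

Starting from $x = 1$, $p = 0$, the result tracks $x = \cos t$, $p = -\sin t$, and the value of $H$ stays at its initial value to high accuracy.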
With this in hand, we turn to quantum particle mechanics. 


5.2.2 Quantum particle mechanics à la Heisenberg—Lagrange—Hamilton 


It seems likely that a particularly direct correspondence between the quantum and the clas- 
sical cases will be obtained if we use the Heisenberg formulation (or ‘picture’) of quantum 
mechanics (see appendix I). In the Schrödinger picture, the dynamical variables such as position
$x$ are independent of time, and the time dependence is carried by the wavefunction.
Thus we seem to have nothing like the q(t)’s. However, one can always do a unitary trans- 
formation to the Heisenberg picture, in which the wavefunction is fixed and the dynamical 
variables change with time. This is what we want in order to parallel the classical quantities 
q(t). But of course there is one fundamental difference between quantum mechanics and 
classical mechanics: in the former, the dynamical variables are operators which in general 
do not commute. In particular, the fundamental commutator states that ($\hbar = 1$)

$$[\hat{q}(t), \hat{p}(t)] = i \tag{5.57}$$

where the hat indicates the operator character of the quantity. Here $\hat{p}$ is defined by the
generalization of (5.44):

$$\hat{p} = \partial\hat{L}/\partial\dot{\hat{q}}. \tag{5.58}$$


In this formulation of quantum mechanics we do not have the Schrödinger-type equation of
motion. Instead we have the Heisenberg equation of motion

$$\dot{\hat{A}} = -i[\hat{A}, \hat{H}] \tag{5.59}$$
where the Hamiltonian operator $\hat{H}$ is defined in terms of the Lagrangian operator $\hat{L}$ by (cf
(5.50))

$$\hat{H} = \hat{p}\dot{\hat{q}} - \hat{L} \tag{5.60}$$

and $\hat{A}$ is any dynamical observable. For example, in the oscillator case

$$\hat{L} = \tfrac{1}{2}m\dot{\hat{q}}^2 - \tfrac{1}{2}m\omega^2\hat{q}^2 \tag{5.61}$$
$$\hat{p} = m\dot{\hat{q}} \tag{5.62}$$

and

$$\hat{H} = \frac{1}{2m}\hat{p}^2 + \tfrac{1}{2}m\omega^2\hat{q}^2 \tag{5.63}$$


which is the total energy operator. Note that $\hat{p}$, obtained from the Lagrangian using (5.58),
had better be consistent with the Heisenberg equation of motion for the operator $\hat{A} = \hat{q}$.
The Heisenberg equation of motion for $\hat{A} = \hat{p}$ leads to

$$\dot{\hat{p}} = -m\omega^2\hat{q} \tag{5.64}$$

which is an operator form of Newton’s law for the harmonic oscillator. Using the expression
for $\hat{p}$ (5.62), we find

$$\ddot{\hat{q}} = -\omega^2\hat{q}. \tag{5.65}$$

Now, although this looks like the familiar classical equation of motion for the position of 
the oscillator—and recovering it from the Lagrangian formalism is encouraging—we must 
be very careful to appreciate that this is an equation stating how an operator evolves with 
time. Where the quantum particle will actually be found is an entirely different matter. By 
sandwiching (5.65) between wavefunctions, we can at once see that the average position 
of the particle will follow the classical trajectory (remember that wavefunctions are inde- 
pendent of time in the Heisenberg formulation). But fluctuations about this trajectory will 
certainly occur: a quantum particle does not follow a ray-like classical trajectory. Come to 
think of it, neither does a photon! 
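
For readers who want the intermediate algebra behind (5.62) and (5.64), the two commutators needed follow from (5.57) and the identity $[\hat{A}, \hat{B}\hat{C}] = [\hat{A}, \hat{B}]\hat{C} + \hat{B}[\hat{A}, \hat{C}]$, with $\hbar = 1$ and the oscillator Hamiltonian (5.63):

```latex
\begin{align*}
[\hat{q}, \hat{p}^2] &= [\hat{q}, \hat{p}]\hat{p} + \hat{p}[\hat{q}, \hat{p}] = 2i\hat{p},
&
[\hat{p}, \hat{q}^2] &= [\hat{p}, \hat{q}]\hat{q} + \hat{q}[\hat{p}, \hat{q}] = -2i\hat{q}, \\
\dot{\hat{q}} &= -i[\hat{q}, \hat{H}] = -\frac{i}{2m}\,(2i\hat{p}) = \frac{\hat{p}}{m},
&
\dot{\hat{p}} &= -i[\hat{p}, \hat{H}] = -\frac{i}{2}\,m\omega^2\,(-2i\hat{q}) = -m\omega^2\hat{q}.
\end{align*}
```

The first result reproduces (5.62), confirming that the momentum defined from the Lagrangian is consistent with the Heisenberg equation for $\hat{q}$; the second is (5.64).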

In the original formulations of quantum theory, such fluctuations were generally taken 
to imply that the very notion of a ‘path’ was no longer a useful one. However, just as the 
differential equations satisfied by operators in the Heisenberg picture are quantum gener- 
alizations of Newtonian mechanics, so there is an analogous quantum generalization of the 
‘path-contribution to the action’ approach to classical mechanics. The idea was first hinted 
at by Dirac (1933, 1981, section 32), but it was Feynman who worked it out completely. The 
book by Feynman and Hibbs (1965) presents a characteristically fascinating discussion— 
here we only wish to indicate the central idea. We ask: how does a particle get from the 
point $q(t_1)$ at time $t_1$ to the point $q(t_2)$ at $t_2$? Referring back to figure 5.4, in the classical
case we imagined (infinitely) many possible paths $q_i(t)$, of which, however, only one was the
actual path followed, namely the one we called $q_c(t)$ which minimized the action integral
(5.38) as a functional of $q(t)$. In the quantum case, however, we previously noted that a
particle will no longer follow any definite path because of quantum fluctuations. But rather 
than, as a consequence, throwing away the whole idea of a path, Feynman’s insight was to 
appreciate that the ‘opposite’ viewpoint is also possible: since unique paths are forbidden in 
quantum theory, we should in principle include all possible paths! In other words, we take 
all the trajectories on figure 5.4 as physically possible (together with all the other infinitely 
many ways of accomplishing the trip). 

However, surely not all paths are equally likely: after all, we must presumably recover 
the classical trajectory as $\hbar \to 0$, in some sense. Thus we must find an appropriate weighting
for the paths. Feynman’s recipe is beautifully simple. Weight each path by the factor

$$e^{iS/\hbar} \tag{5.66}$$
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where S is the action for that particular path. At first sight this is a rather strange proposal, 
since all paths—even the classical one—are weighted by a quantity which is of unit mod- 
ulus. But, of course, contributions of the form (5.66) from all the paths have to be added 
coherently—just as we superposed the amplitudes in the ‘two-slit’ discussion in section 2.5. 
What distinguishes the classical path q.(t) is that it makes S$ stationary under small changes 
of path; thus in its vicinity paths have a strong tendency to add up constructively, while 
far from it the phase factors will tend to produce cancellations. The amount a quantum 
particle can ‘stray’ from the classical path depends on the magnitude of the corresponding 
action relative to h, the quantum of action: the scale of coherence is set by A. 
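
The cancellation argument can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes a free particle with m = T = 1, an artificially small value ħ = 0.01, and a one-parameter family of trial paths q(t) = t/T + a sin(πt/T) that all share the same end points (the names `action` and `mean_phase` are purely illustrative). It averages the phase factor (5.66) over a narrow window of amplitudes a, once around the classical path (a = 0) and once far from it.

```python
import cmath
import math

HBAR = 0.01                    # assumed small value of hbar (units with m = T = 1)
M, T, STEPS = 1.0, 1.0, 400    # assumed mass, travel time and number of time slices

def action(a):
    """Discretized free-particle action for the path q(t) = t/T + a*sin(pi*t/T)."""
    dt = T / STEPS
    q = [i * dt / T + a * math.sin(math.pi * i * dt / T) for i in range(STEPS + 1)]
    # S = sum over time slices of (kinetic energy) * dt; there is no potential term
    return sum(0.5 * M * ((q[i + 1] - q[i]) / dt) ** 2 * dt for i in range(STEPS))

def mean_phase(a_centre, width=0.1, samples=201):
    """Magnitude of the average of exp(iS/hbar) over a window of amplitudes a."""
    total = 0j
    for j in range(samples):
        a = a_centre - width / 2 + width * j / (samples - 1)
        total += cmath.exp(1j * action(a) / HBAR)
    return abs(total / samples)

near = mean_phase(0.0)   # window centred on the classical path
far = mean_phase(1.0)    # window centred far from the classical path
```

With these assumed parameters `near` should come out close to 1, because the action is stationary at a = 0 and the phases barely change across the window, whereas `far` should be much smaller, because the rapidly varying phases largely cancel.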

In summary, then, the quantum mechanical amplitude to go from $q(t_1)$ to $q(t_2)$ is proportional to

$$\sum_{\text{all paths } q(t)} \exp\left(\frac{i}{\hbar}\int_{t_1}^{t_2} L(q(t), \dot{q}(t))\,dt\right). \tag{5.67}$$


There is an evident generalization to quantum field theory. We shall not, however, make 
use of the ‘path integral’ approach to quantum field theory in this volume. Its use was, in 
fact, decisive in obtaining the Feynman rules for non-Abelian gauge theories; and it is the 
only approach suitable for numerical studies of quantum field theories (how can operators 
be simulated numerically?). Nevertheless, for a first introduction to quantum field theory, 
there is still much to be said for the traditional approach based on ‘quantizing the modes’, 
and this is the path we shall follow in the rest of this volume. Not the least of its advantages 
is that it contains the intuitively powerful ‘calculus’ of creation and annihilation operators, 
as we now describe. We shall return to the path integral formalism in chapter 16 of volume 
2. 


5.2.3 Interlude: the quantum oscillator 


As we saw in section 5.1, we need to know the energy spectrum and associated states of a 
quantum harmonic oscillator. This is a standard problem, but there is one particular way 
of solving it—the ‘operator’ approach due to Dirac (1981, chapter 6)—that is so crucial to 
all subsequent development that we include a discussion here in the body of the text. 

For the oscillator Hamiltonian 


$$\hat{H} = \frac{1}{2m}\hat{p}^2 + \frac{1}{2}m\omega^2\hat{q}^2 \tag{5.68}$$


if $\hat{p}$ and $\hat{q}$ were not operators, we could attempt to factorize the Hamiltonian in the form
‘$(q + ip)(q - ip)$’ (apart from the factors of $2m$ and $\omega$). In the quantum case, in which $\hat{p}$ and
$\hat{q}$ do not commute, it still turns out to be very helpful to introduce such combinations. If
we define the operator 


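$$\hat{a} = \frac{1}{\sqrt{2}}\left(\sqrt{m\omega}\,\hat{q} + \frac{i}{\sqrt{m\omega}}\,\hat{p}\right) \tag{5.69}$$

and its Hermitian conjugate

$$\hat{a}^\dagger = \frac{1}{\sqrt{2}}\left(\sqrt{m\omega}\,\hat{q} - \frac{i}{\sqrt{m\omega}}\,\hat{p}\right) \tag{5.70}$$

the Hamiltonian may be written as (see problem 5.4)

$$\hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega. \tag{5.71}$$

The second form for $\hat{H}$ may be obtained from the first using the commutation relation
between $\hat{a}$ and $\hat{a}^\dagger$

$$[\hat{a}, \hat{a}^\dagger] = 1 \tag{5.72}$$

derived using the fundamental commutator between $\hat{p}$ and $\hat{q}$. Using this basic commutator
(5.72) and our expression for $\hat{H}$, (5.71), one can prove the relations (see problem 5.4)

$$[\hat{H}, \hat{a}] = -\omega\hat{a}, \qquad [\hat{H}, \hat{a}^\dagger] = \omega\hat{a}^\dagger. \tag{5.73}$$

Consider now a state $|n\rangle$ which is an eigenstate of $\hat{H}$ with energy $E_n$:

$$\hat{H}|n\rangle = E_n|n\rangle. \tag{5.74}$$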


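Using this definition and the commutators (5.73), we can calculate the energy of the states
$(\hat{a}^\dagger|n\rangle)$ and $(\hat{a}|n\rangle)$. We find

$$\hat{H}(\hat{a}^\dagger|n\rangle) = (E_n + \omega)(\hat{a}^\dagger|n\rangle) \tag{5.75}$$

$$\hat{H}(\hat{a}|n\rangle) = (E_n - \omega)(\hat{a}|n\rangle). \tag{5.76}$$

Thus the operators $\hat{a}^\dagger$ and $\hat{a}$ respectively raise and lower the energy of $|n\rangle$ by one unit of
$\omega$ ($\hbar = 1$). Now since $\hat{H} \sim \hat{p}^2 + \hat{q}^2$ with $\hat{p}$ and $\hat{q}$ Hermitian, we can prove that $\langle\psi|\hat{H}|\psi\rangle$ is
positive-definite for any state $|\psi\rangle$. Thus the operator $\hat{a}$ cannot lower the energy indefinitely:
there must exist a lowest state $|0\rangle$ such that

$$\hat{a}|0\rangle = 0. \tag{5.77}$$

This defines the lowest-energy state of the system; its energy is

$$\hat{H}|0\rangle = \tfrac{1}{2}\omega|0\rangle \tag{5.78}$$

the ‘zero-point energy’ of the quantum oscillator. The first excited state is

$$|1\rangle = \hat{a}^\dagger|0\rangle \tag{5.79}$$

with energy $(1 + \tfrac{1}{2})\omega$. The $n$th state has energy $(n + \tfrac{1}{2})\omega$ and is proportional to $(\hat{a}^\dagger)^n|0\rangle$.
To obtain a normalization

$$\langle n|n\rangle = 1 \tag{5.80}$$

the correct normalization factor can be shown to be (problem 5.4)

$$|n\rangle = \frac{1}{\sqrt{n!}}(\hat{a}^\dagger)^n|0\rangle. \tag{5.81}$$

Returning to the eigenvalue equation for $\hat{H}$, we have arrived at the result

$$\hat{H}|n\rangle = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega|n\rangle = (n + \tfrac{1}{2})\omega|n\rangle \tag{5.82}$$

so that the state $|n\rangle$ defined by (5.81) is an eigenstate of the number operator $\hat{n} = \hat{a}^\dagger\hat{a}$, with
integer eigenvalue $n$:

$$\hat{n}|n\rangle = n|n\rangle. \tag{5.83}$$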
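
These operator relations can be checked with explicit matrices. The sketch below is an illustration under assumptions not made in the text: it represents the lowering operator in a number basis truncated to `DIM` states, using the standard matrix elements ⟨n−1|â|n⟩ = √n that follow from the normalization (5.81); `ladder_matrices` and `matmul` are hypothetical helper names, and `OMEGA` is an arbitrary example frequency. Because of the truncation, the commutator (5.72) is reproduced in every diagonal entry except the last one.

```python
import math

def ladder_matrices(dim):
    """Return (a, a_dag) as dim x dim nested lists in the basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)          # <n-1| a |n> = sqrt(n)
    # Hermitian conjugate = transpose, since all entries are real
    a_dag = [[a[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dag

def matmul(x, y):
    """Plain matrix product of two square nested lists."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

DIM, OMEGA = 6, 2.0                          # assumed truncation size and frequency
a, a_dag = ladder_matrices(DIM)
number = matmul(a_dag, a)                    # number operator, diagonal entries 0, 1, 2, ...
a_a_dag = matmul(a, a_dag)
# commutator [a, a_dag] = a a_dag - a_dag a, entry by entry
comm = [[p - q for p, q in zip(row1, row2)] for row1, row2 in zip(a_a_dag, number)]
# energies (n + 1/2) * omega read off from the diagonal of H = (a_dag a + 1/2) * omega
energies = [(number[n][n] + 0.5) * OMEGA for n in range(DIM)]
```

The last diagonal entry of `comm` equals −(DIM − 1) rather than 1; this is an artefact of cutting off the infinite-dimensional space and not a property of the oscillator.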


It is straightforward to generalize all the foregoing to a system whose Lagrangian is a
sum of $N$ independent oscillators as in (5.48) (but we use $q_r$ here instead of $Q_r$ because the
oscillators are already non-interacting):

$$\hat{L} = \sum_{r=1}^{N}\left(\tfrac{1}{2}m\dot{\hat{q}}_r^2 - \tfrac{1}{2}m\omega_r^2\hat{q}_r^2\right). \tag{5.84}$$

The required generalization of the basic commutation relations (5.57) is

$$[\hat{q}_r, \hat{p}_s] = i\delta_{rs}, \qquad [\hat{q}_r, \hat{q}_s] = [\hat{p}_r, \hat{p}_s] = 0 \tag{5.85}$$

since the different oscillators labelled by the index $r$ or $s$ are all independent. The Hamiltonian
is (cf (5.27))

$$\hat{H} = \sum_{r=1}^{N}\left[(1/2m)\hat{p}_r^2 + \tfrac{1}{2}m\omega_r^2\hat{q}_r^2\right] \tag{5.86}$$

$$= \sum_{r=1}^{N}\left(\hat{a}_r^\dagger\hat{a}_r + \tfrac{1}{2}\right)\omega_r \tag{5.87}$$

with $\hat{a}_r$ and $\hat{a}_r^\dagger$ defined via the analogues of (5.69) and (5.70). Since the eigenvalues of each
number operator $\hat{n}_r = \hat{a}_r^\dagger\hat{a}_r$ are $n_r$, by the previous results, the eigenvalues of $\hat{H}$ indeed
have the form (5.29),

$$E = \sum_{r=1}^{N}\left(n_r + \tfrac{1}{2}\right)\omega_r. \tag{5.88}$$

The corresponding eigenstates are products $|n_1\rangle|n_2\rangle\cdots|n_N\rangle$ of $N$ individual oscillator eigenstates,
where $|n_r\rangle$ contains $n_r$ quanta of excitation, of frequency $\omega_r$; the product state is
usually abbreviated to $|n_1, n_2, \ldots, n_N\rangle$. In the ground state of the system, each individual
oscillator is unexcited: this state is $|0, 0, \ldots, 0\rangle$, which is abbreviated to $|0\rangle$, where it is
understood that

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r. \tag{5.89}$$

The operators $\hat{a}_r^\dagger$ create oscillator quanta; the operators $\hat{a}_r$ destroy oscillator quanta.
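
As a small worked illustration of the energy formula (5.88), the following sketch evaluates the energy of a product state from its occupation numbers. The function name `total_energy` and the three example frequencies are assumptions made purely for illustration; units with ħ = 1 are used, as in the text.

```python
def total_energy(occupations, frequencies):
    """Energy (5.88) of the product state |n_1, ..., n_N>, in units with hbar = 1."""
    if len(occupations) != len(frequencies):
        raise ValueError("need one occupation number per oscillator")
    # each oscillator contributes (n_r + 1/2) * omega_r
    return sum((n + 0.5) * w for n, w in zip(occupations, frequencies))

frequencies = [1.0, 2.0, 3.0]                     # assumed example values of omega_r
ground = total_energy([0, 0, 0], frequencies)     # zero-point energy only
excited = total_energy([2, 0, 1], frequencies)    # two quanta in mode 1, one in mode 3
```

The difference `excited - ground` is 2ω₁ + ω₃: adding quanta changes the energy in units of the mode frequencies, while the zero-point contribution is common to every state.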


5.2.4 Lagrange—Hamilton classical field mechanics 


We now consider how to use the Lagrange-Hamilton approach for a field, starting again 
with the classical case and limiting ourselves to one dimension to start with. 

As explained in the previous section, we shall have in mind the $N \to \infty$ limit of the $N$
degrees of freedom case

$$\{q_r(t);\ r = 1, 2, \ldots, N\} \to \phi(x, t) \tag{5.90}$$

where $x$ is now a continuous variable labelling the displacement of the ‘string’ (to picture a
concrete system, see figure 5.5). At each point $x$ we have an independent degree of freedom
$\phi(x, t)$; thus the field system has a ‘continuous infinity’ of degrees of freedom. We now
formulate everything in terms of a Lagrangian density $\mathcal{L}$:

$$S = \int dt\, L \tag{5.91}$$

where (in one dimension)

$$L = \int dx\, \mathcal{L}. \tag{5.92}$$

Equation (5.90) suggests that $\phi$ has dimension of [length], and since in the discrete case
$L = T - V$, $\mathcal{L}$ has dimension [energy/length]. (In general $\mathcal{L}$ has dimension [energy/volume].)
FIGURE 5.5
The passage from a large number of discrete degrees of freedom (mass points) to a continuous 
degree of freedom (field). 


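A new feature arises because $\phi$ is now a continuous function of $x$, so that $\mathcal{L}$ can depend
on $\partial\phi/\partial x$ as well as on $\phi$ and $\dot{\phi} = \partial\phi/\partial t$: $\mathcal{L} = \mathcal{L}(\phi, \partial\phi/\partial x, \dot{\phi})$.

As before, we postulate the same fundamental principle

$$\delta S = 0 \tag{5.93}$$

meaning that the dynamics of the field $\phi$ is governed by minimizing $S$. This time the total
variation is given by

$$\delta S = \int dt \int dx \left[\frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\delta\left(\frac{\partial\phi}{\partial x}\right) + \frac{\partial\mathcal{L}}{\partial\dot{\phi}}\delta\dot{\phi}\right]. \tag{5.94}$$

Integrating the $\delta\dot{\phi}$ by parts in $t$, and the $\delta(\partial\phi/\partial x)$ by parts in $x$, and discarding the resulting
‘surface’ terms, we obtain

$$\delta S = \int dt \int dx\, \delta\phi \left[\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right)\right]. \tag{5.95}$$

Since $\delta\phi$ is an arbitrary function, the requirement $\delta S = 0$ yields the Euler–Lagrange field
equation

$$\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0. \tag{5.96}$$

The generalization to three dimensions is

$$\frac{\partial\mathcal{L}}{\partial\phi} - \boldsymbol{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial(\boldsymbol{\nabla}\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0. \tag{5.97}$$

As an example, consider

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \tag{5.98}$$

where the factors $\rho$ (mass density) and $c$ (a velocity) have been introduced to get the dimension
of $\mathcal{L}$ right. Inserting this into the Euler–Lagrange field equation (5.96), we obtain

$$\frac{\partial^2\phi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0 \tag{5.99}$$

which is precisely the wave equation (5.30) for the one-dimensional string, now obtained
via the Euler–Lagrange field equations. Note that the Lagrange density $\mathcal{L}$ has the expected
form (cf (5.48)) of ‘kinetic energy density minus potential energy density’.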
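
As a quick numerical cross-check of the wave equation (5.99), the sketch below evaluates both second derivatives by central finite differences for a travelling pulse of the form f(x − ct). The Gaussian profile, the wave speed `C`, the step `h` and the function names are all assumptions chosen for illustration; that any smooth f(x − ct) solves the wave equation is a standard result rather than something derived in this section.

```python
import math

C = 2.0   # assumed wave speed

def phi(x, t):
    """A travelling Gaussian pulse f(x - c t); any smooth profile would do."""
    return math.exp(-(x - C * t) ** 2)

def second_x(x, t, h=1e-3):
    """Central-difference estimate of the second x-derivative of phi."""
    return (phi(x + h, t) - 2 * phi(x, t) + phi(x - h, t)) / h ** 2

def second_t(x, t, h=1e-3):
    """Central-difference estimate of the second t-derivative of phi."""
    return (phi(x, t + h) - 2 * phi(x, t) + phi(x, t - h)) / h ** 2

def wave_residual(x, t):
    """Left-hand side of (5.99); should vanish up to discretization error."""
    return second_x(x, t) - second_t(x, t) / C ** 2

curvature = second_x(0.3, 0.1)        # of order one at this point
residual = wave_residual(0.3, 0.1)    # should be tiny by comparison
```

The individual derivatives are of order one at the chosen point, so a residual many orders of magnitude smaller indicates that the two terms of (5.99) cancel, as they should for a solution.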
For the final step, the passage to quantum mechanics for a field system, we shall be
interested in the Hamiltonian (total energy) of the system, just as we were for the discrete
case. Though we shall not actually use the Hamiltonian in the classical field case, we shall
introduce it here, generalizing it to the quantum theory in the following section. We recall
that Hamiltonian mechanics is formulated in terms of coordinate variables (‘$q$’) and momentum
variables (‘$p$’), rather than the $q$ and $\dot{q}$ of Lagrangian mechanics. In the continuum
(field) case, the Hamiltonian $H$ is written as the integral of a density $\mathcal{H}$ (we remain in one
dimension)

$$H = \int dx\, \mathcal{H} \tag{5.100}$$

while the coordinates $q_r(t)$ become the ‘coordinate field’ $\phi(x, t)$. The question is: what is the
corresponding ‘momentum field’?


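The answer to this is provided by a continuum version of the generalized momentum
derived from the Lagrangian approach (cf equation (5.44))

$$p = \partial L/\partial\dot{q}. \tag{5.101}$$

We define a ‘momentum field’ $\pi(x, t)$, technically called the ‘momentum canonically conjugate
to $\phi$’, by

$$\pi(x, t) = \partial\mathcal{L}/\partial\dot{\phi}(x, t) \tag{5.102}$$

where $\mathcal{L}$ is now the Lagrangian density. Note that $\pi$ has dimensions of a momentum density.
In the classical particle mechanics case, we define the Hamiltonian by

$$H(p, q) = p\dot{q} - L. \tag{5.103}$$

Here we define a Hamiltonian density $\mathcal{H}$ by

$$\mathcal{H}(\phi, \pi) = \pi(x, t)\dot{\phi}(x, t) - \mathcal{L}. \tag{5.104}$$

Let us see how all this works for the one-dimensional string with $\mathcal{L}$ given by

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2. \tag{5.105}$$

We have

$$\pi(x, t) = \rho\,\partial\phi/\partial t \tag{5.106}$$

and

$$\mathcal{H}_\rho = \frac{1}{\rho}\pi^2 - \left[\frac{1}{2\rho}\pi^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right] = \frac{1}{2\rho}\pi^2 + \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \tag{5.107}$$

so that

$$H_\rho = \int\left[\frac{1}{2\rho}\pi^2 + \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right] dx. \tag{5.108}$$

This has exactly the form we expect (see (5.34)), thus verifying the plausibility of the above
prescription.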
Inserting the mode expansion (5.34) into (5.92) and (5.105) we obtain the result (just
as in (5.36) and problem 5.1)

$$L_\rho = \frac{\ell}{2}\sum_{r=1}^{\infty}\left[\frac{1}{2}\rho\dot{A}_r^2 - \frac{1}{2}\rho\omega_r^2 A_r^2\right] \tag{5.109}$$

confirming that the system is equivalent to an infinite number of oscillators. The momentum
canonically conjugate to $A_r$ is

$$p_r = \frac{\partial L_\rho}{\partial\dot{A}_r} = \frac{\ell}{2}\rho\dot{A}_r \tag{5.110}$$

and the Hamiltonian is

$$H_\rho = \sum_{r=1}^{\infty}\left[\frac{2}{\ell}\frac{p_r^2}{2\rho} + \frac{\ell}{2}\,\frac{1}{2}\rho\omega_r^2 A_r^2\right]. \tag{5.111}$$

We may cast (5.111) into nicer form by the change of variables

$$P_r = \sqrt{2/\ell}\;p_r, \qquad Q_r = \sqrt{\ell/2}\;A_r, \tag{5.112}$$

in terms of which

$$H_\rho = \sum_{r=1}^{\infty}\left[\frac{1}{2\rho}P_r^2 + \frac{1}{2}\rho\omega_r^2 Q_r^2\right] \tag{5.113}$$

just as in (5.56), with $N \to \infty$.


5.2.5 Heisenberg—Lagrange—Hamilton quantum field mechanics 


Finally, we are ready to quantize the classical field formalism, and arrive at a quantum field
mechanics, at least for the scalar field $\phi(x, t)$. If we were dealing with the case in which
$\phi(x, t)$ represented the displacement of a one-dimensional stretched string, quantization
would be straightforward. We would take the classical Hamiltonian (5.113) and promote the
mode coordinates $Q_r$ and their conjugate momenta $P_r$ to operators satisfying commutation
relations of the form (5.85). The rest of the analysis would be exactly as in equations (5.86)
to (5.89), except that the number of modes $N$ is infinite. But in the case of the general
scalar field, we do not want to impose the boundary conditions $\phi(0, t) = \phi(\ell, t) = 0$, which
led to the mode expansion (5.34). It is then not so clear how to proceed.

Fortunately, the Lagrange–Hamilton field formalism does indicate the way forward, which
is one good reason for developing it in the first place. (Another is that it is very well suited
to the analysis of symmetries, a crucial aspect of gauge theories; see chapter 7.) In the
previous section we introduced the ‘coordinate-like’ field $\phi(x, t)$ and (via the Lagrangian)
the ‘momentum-like’ field $\pi(x, t)$. To pass to the quantized version of the field theory, we
mimic the procedure followed in the discrete case and promote both the quantities $\phi$ and $\pi$
to operators $\hat{\phi}$ and $\hat{\pi}$, in the Heisenberg picture. As usual, the distinctive feature of quantum
theory is the non-commutativity of certain basic quantities in the theory, for example the
fundamental commutator ($\hbar = 1$)

$$[\hat{q}_r(t), \hat{p}_s(t)] = i\delta_{rs} \tag{5.114}$$

of the discrete case. Thus we expect that the operators $\hat{\phi}$ and $\hat{\pi}$ will obey some commutation
relation which is a continuum generalization of (5.114). The commutator will be of the form
$[\hat{\phi}(x, t), \hat{\pi}(y, t)]$, since (recalling figure 5.5) the discrete index $r$ or $s$ becomes the continuous
variable $x$ or $y$; we also note that (5.114) is between operators at equal times. The continuum
generalization of the $\delta_{rs}$ symbol is the Dirac $\delta$ function, $\delta(x - y)$, with the properties

$$\int_{-\infty}^{\infty}\delta(x)\,dx = 1 \tag{5.115}$$

$$\int_{-\infty}^{\infty}\delta(x - y)f(x)\,dx = f(y) \tag{5.116}$$

for all reasonable functions $f$ (see appendix E). Thus the fundamental commutator of quantum
field theory is taken to be

$$[\hat{\phi}(x, t), \hat{\pi}(y, t)] = i\delta(x - y) \tag{5.117}$$

in the one-dimensional case, with obvious generalization to the three-dimensional case via
the symbol $\delta^3(\boldsymbol{x} - \boldsymbol{y})$. Remembering that we have set $\hbar = 1$, it is straightforward to check
that the dimensions are consistent on both sides. Variables $\hat{\phi}$ and $\hat{\pi}$ obeying such a commutation
relation are said to be ‘conjugate’ to each other.

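What about the commutator of two $\hat{\phi}$’s or two $\hat{\pi}$’s? In the discrete case, two different
$\hat{q}$’s (in the Heisenberg picture) will commute at equal times, $[\hat{q}_r(t), \hat{q}_s(t)] = 0$, and so will
two different $\hat{p}$’s. We therefore expect to supplement (5.117) with

$$[\hat{\phi}(x, t), \hat{\phi}(y, t)] = [\hat{\pi}(x, t), \hat{\pi}(y, t)] = 0. \tag{5.118}$$

Let us now proceed to explore the effect of these fundamental commutator assumptions,
for the case of the Lagrangian density which yielded the wave equation via the Euler–Lagrange
equations, namely

$$\hat{\mathcal{L}}_\rho = \frac{1}{2}\rho\left(\frac{\partial\hat{\phi}}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2. \tag{5.119}$$

If we remove $\rho$ and set $c = 1$, we obtain

$$\hat{\mathcal{L}} = \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial t}\right)^2 - \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2 \tag{5.120}$$

for which the Euler–Lagrange equation yields the field equation

$$\frac{\partial^2\hat{\phi}}{\partial t^2} - \frac{\partial^2\hat{\phi}}{\partial x^2} = 0. \tag{5.121}$$

We can think of (5.121) as a highly simplified (spin-0, one-dimensional) version of the wave
equation satisfied by the electromagnetic potentials. We may guess, then, that the associated
quanta are massless, as we shall soon confirm.
The Lagrangian density (5.120) is our prototype quantum field Lagrangian (one often
slips into leaving out the word ‘density’). Applying the quantized version of the definition
(5.102) of the conjugate momentum we then have

$$\hat{\pi}(x, t) = \frac{\partial\hat{\mathcal{L}}}{\partial\dot{\hat{\phi}}(x, t)} = \dot{\hat{\phi}}(x, t) \tag{5.122}$$

and the Hamiltonian density is

$$\hat{\mathcal{H}} = \hat{\pi}\dot{\hat{\phi}} - \hat{\mathcal{L}} = \frac{1}{2}\hat{\pi}^2 + \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2. \tag{5.123}$$

The total Hamiltonian is

$$\hat{H} = \int\hat{\mathcal{H}}\,dx = \frac{1}{2}\int\left[\hat{\pi}^2 + \left(\frac{\partial\hat{\phi}}{\partial x}\right)^2\right] dx. \tag{5.124}$$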

It is not immediately clear how to find the eigenvalues and eigenstates of the operator
$\hat{H}$. However, it is exactly at this point that all our preliminary work on normal modes
comes into its own. If we can write the Hamiltonian as some kind of sum over independent
oscillators (i.e. modes) we shall know how to proceed. For the classical string with fixed
end points which was considered in section 5.1, the mode expansion was simply a Fourier
expansion. In the present case, we want to allow the field to extend throughout all of space,
without the periodicity imposed by fixed-end boundary conditions. In that case, the Fourier
series is replaced by a Fourier integral, and standing waves are replaced by travelling waves.
For the classical field obeying the wave equation (5.30), there are plane-wave solutions

$$\phi(x, t) \propto e^{ikx - i\omega t} \tag{5.125}$$

where ($c = 1$)

$$\omega = k \tag{5.126}$$

which is just the dispersion relation of light in vacuo. The general field may be Fourier
expanded in terms of these solutions:

$$\phi(x, t) = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}\left[a(k)e^{ikx - i\omega t} + a^*(k)e^{-ikx + i\omega t}\right] \tag{5.127}$$

where we have required $\phi$ to be real. (The rather fussy factors $(2\pi\sqrt{2\omega})^{-1}$ are purely conventional,
and determine the normalization of the expansion coefficients $a$, $a^*$ and $\hat{a}$, $\hat{a}^\dagger$ later;
in turn, the latter enter into the definition, and normalization, of the states; see (5.143).)


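Similarly, the ‘momentum field’ $\pi = \dot{\phi}$ is expanded as

$$\pi = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}(-i\omega)\left[a(k)e^{ikx - i\omega t} - a^*(k)e^{-ikx + i\omega t}\right]. \tag{5.128}$$

We quantize these mode expressions by promoting $\phi \to \hat{\phi}$, $\pi \to \hat{\pi}$ and assuming the commutator
(5.117). Thus we write

$$\hat{\phi} = \int_{-\infty}^{\infty}\frac{dk}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right] \tag{5.129}$$

and similarly for $\hat{\pi}$. The commutator (5.117) now determines the commutators of the mode
operators $\hat{a}$ and $\hat{a}^\dagger$:

$$[\hat{a}(k), \hat{a}^\dagger(k')] = 2\pi\delta(k - k'), \qquad [\hat{a}(k), \hat{a}(k')] = [\hat{a}^\dagger(k), \hat{a}^\dagger(k')] = 0 \tag{5.130}$$

as shown in problem 5.6. These are the desired continuum analogues of the discrete oscillator
commutation relations

$$[\hat{a}_r, \hat{a}_s^\dagger] = \delta_{rs}, \qquad [\hat{a}_r, \hat{a}_s] = [\hat{a}_r^\dagger, \hat{a}_s^\dagger] = 0. \tag{5.131}$$

The precise factor in front of the $\delta$-function in (5.130) depends on the normalization choice
made in the expansion of $\hat{\phi}$, (5.129). Problem 5.6 also shows that the commutation relations
(5.130) lead to (5.118) as expected.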

The form of the $\hat{a}$, $\hat{a}^\dagger$ commutation relations (5.130) already suggests that the $\hat{a}(k)$
and $\hat{a}^\dagger(k)$ operators are precisely the single-quantum destruction and creation operators
for the continuum problem. To verify this interpretation and find the eigenvalues of $\hat{H}$, we
now insert the expansion for $\hat{\phi}$ and $\hat{\pi}$ into $\hat{H}$ of (5.124). One finds the remarkable result
(problem 5.7)

$$\hat{H} = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\left\{\tfrac{1}{2}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right]\omega\right\}. \tag{5.132}$$

Comparing this with the single-oscillator result

$$\hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega \tag{5.133}$$

shows that, as anticipated in section 5.1, each classical mode of the field can be quantized,
and behaves like a separate oscillator coordinate, with its own frequency $\omega = k$. The operator
$\hat{a}^\dagger(k)$ creates, and $\hat{a}(k)$ destroys, a quantum of the $k$ mode. The factor $(2\pi)^{-1}$ in $\hat{H}$ arises
from our normalization choice.

We note that in the field operator $\hat{\phi}$ of (5.129), those terms which destroy quanta go with
the factor $e^{-i\omega t}$, while those which create quanta go with $e^{+i\omega t}$. This choice is deliberate and
is consistent with the ‘absorption’ and ‘emission’ factors $e^{\mp i\omega t}$ of ordinary time-dependent
perturbation theory in quantum mechanics (cf equation (A.33) of appendix A).

What is the mass of these quanta? We know that their frequency $\omega$ is related to their
wavenumber $k$ by (5.126), which (restoring $\hbar$’s and $c$’s) can be regarded as equivalent to
$\hbar\omega = \hbar ck$, or $E = cp$, where we use the Einstein and de Broglie relations. This is precisely
the $E$–$p$ relation appropriate to a massless particle, as expected.

What is the energy spectrum? We expect the ground state to be determined by the
continuum analogue of

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r; \tag{5.134}$$

namely

$$\hat{a}(k)|0\rangle = 0 \quad \text{for all } k. \tag{5.135}$$

However, there is a problem with this. If we allow the Hamiltonian of (5.132) to act on $|0\rangle$
the result is not (as we would expect) zero, because of the $\hat{a}(k)\hat{a}^\dagger(k)$ term (the other term
does give zero by (5.135)). In the single oscillator case, we rewrote $\hat{a}\hat{a}^\dagger$ in terms of $\hat{a}^\dagger\hat{a}$ by
using the commutation relation (5.72), and this led to the ‘zero-point energy’, $\tfrac{1}{2}\omega$, of the
oscillator ground state. Adopting the same strategy here, we write $\hat{H}$ of (5.132) as

$$\hat{H} = \int\frac{dk}{2\pi}\,\hat{a}^\dagger(k)\hat{a}(k)\,\omega + \int\frac{dk}{2\pi}\,\frac{1}{2}[\hat{a}(k), \hat{a}^\dagger(k)]\,\omega. \tag{5.136}$$

Now consider $\hat{H}|0\rangle$. We see from the definition of the vacuum (5.135) that the first term
will give zero as expected, but the second term is infinite, since the commutation relation
(5.130) produces the infinite quantity ‘$\delta(0)$’ as $k \to k'$; moreover, the $k$ integral diverges.

This term is obviously the continuum analogue of the zero-point energy $\tfrac{1}{2}\omega$, but because
there are infinitely many oscillators, it is infinite. The conventional ploy is to argue that
only energy differences, relative to a conveniently defined ground state, really matter, so
that we may discard the infinite constant in (5.136). Then the ground state $|0\rangle$ has energy
zero, by definition, and the eigenvalues of $\hat{H}$ are of the form

$$\int\frac{dk}{2\pi}\,n(k)\,\omega \tag{5.137}$$

where $n(k)$ is the number of quanta (counted by the number operator $\hat{a}^\dagger(k)\hat{a}(k)$) of energy
$\omega = k$. For each definite $k$, and hence $\omega$, the spectrum is like that of the simple harmonic
oscillator. The process of going from (5.132) to (5.136) without the second term is called
‘normally ordering’ the $\hat{a}$ and $\hat{a}^\dagger$ operators: in a ‘normally ordered’ expression, all $\hat{a}^\dagger$’s are to
the left of all $\hat{a}$’s, with the result that the vacuum value of such expressions is by definition
zero.

It has to be admitted that the argument that only energy differences matter is false 
as far as gravity is concerned, which couples to all sources of energy. It would ultimately 
be desirable to have theories in which the vacuum energy came out finite from the start 
(as actually happens in ‘supersymmetric’ field theories—see for example Weinberg (1995), 
p 325); see also comment (3). 

We proceed on to the excited states. Any desired state in which excitation quanta are
present can be formed by the appropriate application of $\hat{a}^\dagger(k)$ operators to the ground
state $|0\rangle$. For example, a two-quantum state containing one quantum of momentum $k_1$ and
another of momentum $k_2$ may be written (cf (5.81))

$$|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle. \tag{5.138}$$

A general state will contain an arbitrary number of quanta.

Once again, and this time more formally, we have completed the programme outlined in
section 5.1, ending up with the ‘quantization’ of a classical field $\phi(x, t)$, as exemplified in the
basic expression (5.129), together with the interpretation of the operators $\hat{a}(k)$ and $\hat{a}^\dagger(k)$
as destruction and creation operators for mode quanta. We have, at least implicitly, still
retained up to this point the ‘mechanical model’ of some material object oscillating: some
kind of infinitely extended ‘jelly’. We now throw away the mechanical props and embrace
the unadorned quantum field theory! We do not ask what is waving, we simply postulate
a field, such as $\phi$, and quantize it. Its quanta of excitation are what we call particles, for
example photons in the electromagnetic case.

We end this long section with some further remarks about the formalism, and the physical
interpretation of our quantum field $\hat{\phi}$.


Comment (1) 


The alert reader, who has studied appendix I, may be worried about the following (possible)
consistency problem. The fields $\hat{\phi}$ and $\hat{\pi}$ are Heisenberg picture operators and obey the
equations of motion

$$\dot{\hat{\phi}}(x, t) = -i[\hat{\phi}(x, t), \hat{H}] \tag{5.139}$$

$$\dot{\hat{\pi}}(x, t) = -i[\hat{\pi}(x, t), \hat{H}] \tag{5.140}$$

where $\hat{H}$ is given by (5.132). It is a good exercise to check (problem 5.8(a)) that (5.139)
yields just the expected relation $\dot{\hat{\phi}}(x, t) = \hat{\pi}(x, t)$ (cf (5.122)). Thus (5.140) becomes

$$\ddot{\hat{\phi}}(x, t) = -i[\hat{\pi}(x, t), \hat{H}]. \tag{5.141}$$

However, we have assumed in our work here that $\hat{\phi}$ obeyed the wave equation (cf (5.121))

$$\ddot{\hat{\phi}}(x, t) = \frac{\partial^2}{\partial x^2}\hat{\phi}(x, t) \tag{5.142}$$

as a consequence of the quantized version of the Euler–Lagrange equation (5.96). Thus the
right-hand sides of (5.141) and (5.142) need to be the same, for consistency, and they
are: see problem 5.8(b). Thus, at least in this case, the Heisenberg operator equations of
motion are consistent with the Euler–Lagrange equations.


Comment (2) 


Following on from this, we may note that this formalism encompasses both the wave and 
the particle aspects of matter and radiation. The former is evident from the plane-wave 
expansion functions in the expansion of $\hat{\phi}$, (5.129), which in turn originate from the fact
that $\hat{\phi}$ obeys the wave equation (5.121). The latter follows from the discrete nature of the
energy spectrum and the associated operators $\hat{a}$, $\hat{a}^\dagger$ which refer to individual quanta, i.e.
particles. 


Comment (3) 


Next, we may ask: what is the meaning of the ground state $|0\rangle$ for a quantum field? It is
undoubtedly the state with $n(k) = 0$ for all $k$, i.e. the state with no quanta in it, and hence
no particles in it, on our new interpretation. It is therefore the vacuum! As we shall see later,
this understanding of the vacuum as the ground state of a field system is fundamental to
much of modern particle physics, for example to quark confinement and to the generation
of mass for the weak vector bosons. Note that although we discarded the overall (infinite)
constant in $\hat{H}$, differences in zero-point energies can be detected; for example, in the Casimir
effect (Casimir 1948, Kitchener and Prosser 1957, Sparnaay 1958, Lamoreaux 1997, 1998).
These and other aspects of the quantum field theory vacuum are discussed in Aitchison
(1985).


Comment (4) 


Consider the two-particle state (5.138): $|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle$. Since the $\hat{a}^\dagger$ operators
commute, (5.130), this state is symmetric under the interchange $k_1 \leftrightarrow k_2$. This is an inevitable
feature of the formalism as so far developed: there is no possible way of distinguishing
one quantum of energy from another, and we expect the two-quantum state to be
indifferent to the order in which the quanta are put in it. However, this has an important
implication for the particle interpretation: since the state is symmetric under interchange
of the particle labels $k_1$ and $k_2$, it must describe identical bosons. How the formalism is
modified in order to describe the antisymmetric states required for two fermionic quanta
will be discussed in section 7.2.


Comment (5) 


Finally, the reader may well wonder how to connect the quantum field theory formalism
to ordinary ‘wavefunction’ quantum mechanics. The ability to see this connection will be
important in subsequent chapters and it is indeed quite simple. Suppose we form a state
containing one quantum of the $\hat{\phi}$ field, with momentum $k'$:

$$|k'\rangle = N\hat{a}^\dagger(k')|0\rangle \tag{5.143}$$

where $N$ is a normalization constant. Now consider the amplitude $\langle 0|\hat{\phi}(x, t)|k'\rangle$. We expand
this out as

$$\langle 0|\hat{\phi}(x, t)|k'\rangle = \langle 0|\int\frac{dk}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right] N\hat{a}^\dagger(k')|0\rangle. \tag{5.144}$$

The ‘$\hat{a}^\dagger\hat{a}^\dagger$’ term will give zero since $\langle 0|\hat{a}^\dagger = 0$. For the other term, we use the commutation
relation (5.130) to write it as

$$\langle 0|\int\frac{N\,dk}{2\pi\sqrt{2\omega}}\left[\hat{a}^\dagger(k')\hat{a}(k) + 2\pi\delta(k - k')\right]e^{ikx - i\omega t}|0\rangle = N\frac{e^{ik'x - i\omega' t}}{\sqrt{2\omega'}} \tag{5.145}$$

using the vacuum condition once again, and integrating over the $\delta$ function using the property
(5.116) which sets $k = k'$ and hence $\omega = \omega'$. The vacuum is normalized to unity,
$\langle 0|0\rangle = 1$. The normalization constant $N$ can be adjusted according to the desired convention
for the normalization of the states and wavefunctions. The result is just the plane-wave
wavefunction for a particle in the state $|k'\rangle$! Thus we discover that the vacuum to one-particle
matrix elements of the field operators are just the familiar wavefunctions of single-particle
quantum mechanics. In this connection we can explain some common terminology. The path
to quantum field theory that we have followed is sometimes called ‘second quantization’,
ordinary single-particle quantum mechanics being the first-quantized version of the
theory.
5.3 Generalizations: four dimensions, relativity, and mass


In the previous section we have shown how quantum mechanics may be married to field
theory, but we have considered only one spatial dimension, for simplicity. Now we must
generalize to three and incorporate the demands of relativity. This is very easy to do in
the Lagrangian approach, for the scalar field $\phi(\boldsymbol{x}, t)$. ‘Scalar’ means that the field has only
one independent component at each point $(\boldsymbol{x}, t)$, unlike the electromagnetic field, for instance,
for which the analogous quantity has four components, making up a 4-vector field
$A^\mu(\boldsymbol{x}, t) = (A^0(\boldsymbol{x}, t), \boldsymbol{A}(\boldsymbol{x}, t))$ (see chapter 7). In the quantum case, a one-component field
(or wavefunction) is appropriate for spin-0 particles.
As we saw in (5.97), the three-dimensional Euler–Lagrange equations are

$$\frac{\partial\mathcal{L}}{\partial\phi} - \boldsymbol{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial(\boldsymbol{\nabla}\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0 \tag{5.146}$$

which may immediately be rewritten in relativistically invariant form

$$\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = 0 \tag{5.147}$$

where $\partial_\mu = \partial/\partial x^\mu$. Similarly, the action

$$S = \int dt\int d^3\boldsymbol{x}\,\mathcal{L} = \int d^4x\,\mathcal{L} \tag{5.148}$$

will be relativistically invariant if $\mathcal{L}$ is, since the volume element $d^4x$ is invariant. Thus, to
construct a relativistic field theory, we have to construct an invariant density $\mathcal{L}$ and use the
already given covariant Euler–Lagrange equation. Thus our previous string Lagrangian

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \tag{5.149}$$

with $\rho = c = 1$ generalizes to

$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi \tag{5.150}$$

and produces the invariant wave equation

$$\partial_\mu\partial^\mu\phi = \left(\frac{\partial^2}{\partial t^2} - \boldsymbol{\nabla}^2\right)\phi = 0. \tag{5.151}$$

All of this goes through just the same when the fields are quantized.

This invariant Lagrangian describes a field whose quanta are massless. To find the Lagrangian
for the case of massive quanta, we need to find the Lagrangian that gives us the
Klein–Gordon equation (see section 3.1)

$$(\Box + m^2)\phi(\boldsymbol{x}, t) = 0 \tag{5.152}$$

via the Euler–Lagrange equations.
The answer is a simple generalization of (5.150):

$$\mathcal{L}_{\rm KG} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2. \tag{5.153}$$

The plane-wave solutions of the field equation (now the KG equation) have frequencies
(or energies) given by

$$\omega^2 = \boldsymbol{k}^2 + m^2 \tag{5.154}$$

which is the correct energy-momentum relation for a massive particle.
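
The limits of the dispersion relation (5.154) are easy to check numerically. The sketch below is illustrative only: the function name `energy` and the sample wavevectors and masses are assumptions, and units with ħ = c = 1 are used as in the text.

```python
import math

def energy(k, m):
    """omega = +sqrt(k^2 + m^2) for wavevector components k, units hbar = c = 1."""
    return math.sqrt(sum(component * component for component in k) + m * m)

massless = energy((3.0, 4.0, 0.0), 0.0)    # m = 0: reduces to |k|, the relation (5.126)
at_rest = energy((0.0, 0.0, 0.0), 2.0)     # k = 0: reduces to the rest energy m
slow = energy((0.01, 0.0, 0.0), 2.0)       # small |k|: close to m + k^2 / (2m)
```

For m = 0 the function returns |k|, matching the massless case treated earlier, while for small |k| it approaches the familiar non-relativistic form of rest energy plus kinetic energy k²/2m.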
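How do we quantize this field theory? The four-dimensional analogue of the Fourier
expansion of the field $\hat{\phi}$ takes the form

$$\hat{\phi}(\boldsymbol{x}, t) = \int_{-\infty}^{\infty}\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{-ik\cdot x} + \hat{a}^\dagger(k)e^{ik\cdot x}\right] \tag{5.155}$$

with a similar expansion for the ‘conjugate momentum’ $\hat{\pi} = \dot{\hat{\phi}}$:

$$\hat{\pi}(\boldsymbol{x}, t) = \int_{-\infty}^{\infty}\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}(-i\omega)\left[\hat{a}(k)e^{-ik\cdot x} - \hat{a}^\dagger(k)e^{ik\cdot x}\right]. \tag{5.156}$$

Here $k\cdot x$ is the four-dimensional dot product $k\cdot x = \omega t - \boldsymbol{k}\cdot\boldsymbol{x}$, and $\omega = +(\boldsymbol{k}^2 + m^2)^{1/2}$.
The Hamiltonian is found to be

$$\hat{H}_{\rm KG} = \int d^3\boldsymbol{x}\,\hat{\mathcal{H}}_{\rm KG} = \frac{1}{2}\int d^3\boldsymbol{x}\left[\hat{\pi}^2 + \boldsymbol{\nabla}\hat{\phi}\cdot\boldsymbol{\nabla}\hat{\phi} + m^2\hat{\phi}^2\right] \tag{5.157}$$

and this can be expressed in terms of the $\hat{a}$’s and the $\hat{a}^\dagger$’s using the expansion for $\hat{\phi}$ and $\hat{\pi}$
and the commutator

$$[\hat{a}(k), \hat{a}^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \tag{5.158}$$

with all others vanishing. The result is, as expected,

$$\hat{H}_{\rm KG} = \frac{1}{2}\int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right]\omega \tag{5.159}$$

and, normally ordering as usual, we arrive at

$$\hat{H}_{\rm KG} = \int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\,\hat{a}^\dagger(k)\hat{a}(k)\,\omega. \tag{5.160}$$

This supports the physical interpretation of the mode operators $\hat{a}^\dagger$ and $\hat{a}$ as creation and
destruction operators for quanta of the field $\hat{\phi}$ as before, except that now the energy-momentum
relation for these particles is the relativistic one, for particles of mass $m$.

Since $\hat{\phi}$ is real ($\hat{\phi} = \hat{\phi}^\dagger$) and has no spin degrees of freedom, it is called a real scalar
field. Only field quanta of one type enter: those created by $\hat{a}^\dagger$ and destroyed by $\hat{a}$. Thus
$\hat{\phi}$ would correspond physically to a case where there was a unique particle state of a given
mass $m$, for example the $\pi^0$ field. Actually, of course, we would not want to describe the
$\pi^0$ in any fundamental sense in terms of such a field, since we know it is not a point-like
object (‘$\hat{\phi}$’ is defined only at the single space-time point $(\boldsymbol{x}, t)$). The question of whether
true ‘elementary’ scalar fields exist in nature is an interesting one. In the SM, as we shall
eventually see in volume 2, the Higgs field is a scalar field (though it contains several
components with different charge). It remains to be seen if this field, and the associated
quantum, the Higgs boson, is elementary or composite.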

We have learned how to describe free relativistic spinless particles of finite mass as the 
quanta of a relativistic quantum field. We now need to understand interactions in quantum 
field theory. 
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Problems 
5.1 Verify equation (5.36). 
5.2 Consider one-dimensional motion under gravity so that V(x) = —mgz in (5.39). Eval- 
uate S of (5.38) for tı = 0, t2 = to, for three possible trajectories: 
(a) x(t) =at, 
(b) x(t) = $gt? (the Newtonian result) and 
(c) x(t) = bt? 
where the constants a and b are to be chosen so that all the trajectories end at the same 
point æ(to). 
5.3 


(a) Use (5.57) and (5.63) to verify that 


Ê = må 


is consistent with the Heisenberg equation of motion for A= q. 
(b) By similar methods verify that 
p= —mw? 4. 


5.4
(a) Rewrite the Hamiltonian $\hat H$ of (5.63) in terms of the operators $\hat a$ and $\hat a^\dagger$.
(b) Evaluate the commutator between $\hat a$ and $\hat a^\dagger$ and use this result together with your
expression for $\hat H$ from part (a) to verify equation (5.73).
(c) Verify that for $|n\rangle$ given by equation (5.81) the normalization condition
$\langle n|n\rangle = 1$
is satisfied.
(d) Verify (5.83) directly using the commutation relation (5.72).
5.5 Treating $\psi$ and $\psi^*$ as independent classical fields, show that the Lagrangian density

$$\mathcal{L} = i\psi^*\dot\psi - (1/2m)\nabla\psi^*\cdot\nabla\psi$$

gives the Schrödinger equation for $\psi$ and $\psi^*$ correctly.


5.6
(a) Verify that the commutation relations for $\hat a(k)$ and $\hat a^\dagger(k)$ (equations (5.130)) are
consistent with the equal time commutation relation between $\hat\phi$ and $\hat\pi$ (equation (5.117)),
and with (5.118).
(b) Consider the unequal time commutator $D(x_1, x_2) = [\hat\phi(\boldsymbol{x}_1, t_1), \hat\phi(\boldsymbol{x}_2, t_2)]$, where
$\hat\phi$ is a massive KG field in three dimensions. Show that

$$D(x_1, x_2) = \int \frac{d^3\boldsymbol{k}}{(2\pi)^3 2E}\left[e^{-ik\cdot(x_1 - x_2)} - e^{ik\cdot(x_1 - x_2)}\right] \tag{5.161}$$

where $k\cdot(x_1 - x_2) = E(t_1 - t_2) - \boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)$, and $E = (\boldsymbol{k}^2 + m^2)^{1/2}$. Note that
$D$ is not an operator, and that it depends only on the difference of coordinates
$x_1 - x_2$, consistent with translation invariance. Show that $D(x_1, x_2)$ vanishes for
$t_1 = t_2$. Explain why the right-hand side of (5.161) is Lorentz invariant (see the
exercise in appendix E), and use this fact to show that $D(x_1, x_2)$ vanishes for all
space-like separations $(x_1 - x_2)^2 < 0$. Discuss the significance of this result, or
see the discussion in section 6.3.2!


5.7 Insert the plane-wave expansions for the operators $\hat\phi$ and $\hat\pi$ into the equation for $\hat H$,
(5.124), and verify equation (5.132). [Hint: note that $\omega$ is defined to be always positive, so
that (5.126) should strictly be written $\omega = |k|$.]


5.8
(a) Use (5.117) and (5.124) to verify that $\hat\pi(x,t) = \dot{\hat\phi}(x,t)$ is consistent with the
Heisenberg equation of motion for $\hat\phi(x,t)$. [Hint: write the integral in (5.124) as
over $y$, not $x$!]
(b) Similarly, verify the consistency of (5.141) and (5.121).
Quantum Field Theory II: Interacting Scalar Fields


6.1 Interactions in quantum field theory: qualitative introduction 


In the previous chapter we considered only free—i.e. non-interacting—quantum fields. The 
fact that they are non-interacting is evident in a number of ways. The mode expansions 
(5.129) and (5.155), are written in terms of the (free) plane-wave solutions of the associated 
wave equations. Also the Hamiltonians turned out to be just the sum of individual oscillator 
Hamiltonians for each mode frequency, as in (5.132) or (5.159). The energies of the quanta 
add up—they are non-interacting quanta. Finally, since the Hamiltonians are just sums of 
number operators 

$$\hat n(k) = \hat a^\dagger(k)\hat a(k) \tag{6.1}$$


it is obvious that each such operator commutes with the Hamiltonian and is therefore 
a constant of the motion. Thus two waves, each with one excitation quantum, travelling 
towards each other will pass smoothly through each other and emerge unscathed on the 
other side—they will not interact at all. 

How can we get the mode quanta to interact? If we return to our discussion of classical 
mechanical systems in section 5.1, we see that the crucial step in arriving at the ‘sum over 
oscillators’ form for the energy was the assumption that the potential energy was quadratic 
in the small displacements $q_r$. We expect that ‘modes will interact’ when we go beyond this
harmonic approximation. The same is true in the continuous (wave or field) case. In the
derivation of the appropriate wave equation you will find that somewhere an approximation
like $\tan\phi \approx \phi$ or $\sin\phi \approx \phi$ is made. This linearizes the equation, and solutions to linear
equations can be linearly superposed to make new solutions. If we retain higher powers of
$\phi$, such as $\phi^3$, the resulting nonlinear equation has solutions that cannot be obtained by
superposing two independent solutions. Thus two waves travelling towards each other will 
not just pass smoothly through each other: various forms of interaction and distortion of 
the original waveforms will occur. 

What happens when we quantize such anharmonic systems? To gain some idea of the 
new features that emerge, consider just one ‘anharmonic oscillator’ with Hamiltonian 


$$\hat H = (1/2m)\hat p^2 + \tfrac{1}{2}m\omega^2\hat q^2 + \lambda\hat q^3. \tag{6.2}$$

In terms of the $\hat a$ and $\hat a^\dagger$ combinations this becomes

$$\hat H = \tfrac{1}{2}(\hat a^\dagger\hat a + \hat a\hat a^\dagger)\,\omega + \frac{\lambda}{(2m\omega)^{3/2}}(\hat a + \hat a^\dagger)^3 \tag{6.3}$$

$$\phantom{\hat H} = \hat H_0 + \lambda\hat H' \tag{6.4}$$

where $\hat H_0$ is our previous free oscillator Hamiltonian. The algebraic tricks we used to find
the spectrum of $\hat H_0$ do not work for this new $\hat H$ because of the addition of the $\hat H'$ interaction
term. In particular, although $\hat H_0$ commutes with the number operator $\hat a^\dagger\hat a$, $\hat H'$ does not.
Therefore, whatever the eigenstates of $\hat H$ are, they will not in general have a definite number
of ‘$\hat H_0$ quanta’. In fact, we cannot find an exact algebraic solution to this new eigenvalue
problem, and we must resort to perturbation theory or to numerical methods.

The perturbative solution to this problem treats $\lambda\hat H'$ as a perturbation and expands the
true eigenstates of $\hat H$ in terms of the eigenstates of $\hat H_0$:

$$|\bar r\rangle = \sum_n c_{rn}|n\rangle. \tag{6.5}$$

From this expansion we see that, as expected, the true eigenstates $|\bar r\rangle$ will ‘contain different
numbers of $\hat H_0$ quanta’: $|c_{rn}|^2$ is the probability of finding $n$ ‘$\hat H_0$ quanta’ in the state
$|\bar r\rangle$. Perturbation theory now proceeds by expanding the coefficients $c_{rn}$ and exact energy
eigenvalues $\bar E_r$ as power series in the strength $\lambda$ of the perturbation. For example, the exact
energy eigenvalue has the expansion

$$\bar E_r = E_r^{(0)} + \lambda E_r^{(1)} + \lambda^2 E_r^{(2)} + \cdots \tag{6.6}$$

where

$$\hat H_0|r\rangle = E_r^{(0)}|r\rangle \tag{6.7}$$

and

$$E_r^{(1)} = \langle r|\hat H'|r\rangle \tag{6.8}$$

$$E_r^{(2)} = \sum_{s\neq r}\frac{\langle r|\hat H'|s\rangle\langle s|\hat H'|r\rangle}{E_r^{(0)} - E_s^{(0)}}. \tag{6.9}$$

To evaluate the second-order shift in energy, we therefore need to consider matrix elements
of the form

$$\langle s|(\hat a + \hat a^\dagger)^3|r\rangle. \tag{6.10}$$

Keeping careful track of the order of the $\hat a$ and $\hat a^\dagger$ operators, we can evaluate these matrix
elements and find, in this case, that there are non-zero matrix elements for states $\langle s| =
\langle r+3|$, $\langle r+1|$, $\langle r-1|$ and $\langle r-3|$.
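
The selection rule quoted above, and the second-order shift (6.9), can be checked numerically. The following is a minimal sketch, not part of the original text: it represents $\hat a + \hat a^\dagger$ as a truncated matrix in the number basis, cubes it, and sums (6.9) with $E_r^{(0)} = (r + \frac{1}{2})\omega$ and $\hat H'$ taken from (6.3). The function names, the truncation size `dim` and the default values $m = \omega = 1$ are assumptions made for illustration only.

```python
import math

def x_matrix(dim):
    """Matrix of (a + a_dagger) in the truncated number basis |0>, ..., |dim-1>."""
    x = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        # <n-1| a |n> = sqrt(n); a_dagger is the transpose of a.
        x[n - 1][n] = math.sqrt(n)
        x[n][n - 1] = math.sqrt(n)
    return x

def matmul(a, b):
    """Plain product of two square matrices stored as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def cubic_elements(r, dim=16):
    """Non-zero <s|(a + a_dagger)^3|r>, returned as {s - r: value}.

    The truncation does not affect the result as long as r + 3 < dim.
    """
    x = x_matrix(dim)
    x3 = matmul(matmul(x, x), x)
    return {s - r: x3[s][r] for s in range(dim) if abs(x3[s][r]) > 1e-12}

def second_order_shift(r, m=1.0, omega=1.0, dim=16):
    """Coefficient E_r^(2) of lambda^2 in (6.6), evaluated from (6.9)."""
    prefactor = (2.0 * m * omega) ** -1.5  # H' = prefactor * (a + a_dagger)^3, cf. (6.3)
    total = 0.0
    for offset, value in cubic_elements(r, dim).items():
        # For s = r + offset, E_r^(0) - E_s^(0) = -offset * omega.
        total += (prefactor * value) ** 2 / (-offset * omega)
    return total

if __name__ == "__main__":
    print(cubic_elements(4))        # keys are the allowed values of s - r
    print(second_order_shift(0))    # ground-state coefficient of lambda^2
```

For $m = \omega = 1$ the sum works out to $-(30r^2 + 30r + 11)/8$, so the second-order shift is negative for every level, as one would expect for a potential that is lowered on one side.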

What about the quantum mechanics of two coupled nonlinear oscillators? In the same
way, the general state is assumed to be a superposition

$$|\bar r\rangle = \sum_{n_1, n_2} c_{r, n_1 n_2}|n_1\rangle|n_2\rangle \tag{6.11}$$

of states of arbitrary numbers of quanta of the unperturbed oscillator Hamiltonians $\hat H_{0(1)}$
and $\hat H_{0(2)}$. States of the unperturbed system contain definite numbers $n_1$ and $n_2$, say, of the
‘1’ and ‘2’ quanta. Perturbation calculations of the interacting system will involve matrix
elements connecting such $|n_1\rangle|n_2\rangle$ states to states $|n_1'\rangle|n_2'\rangle$ with different numbers of these
quanta.

All this can be summarized by the remark that the typical feature of quantized inter- 
acting modes is that we need to consider processes in which the numbers of the different 
mode quanta are not constants of the motion. This is, of course, exactly what happens when 
we have collisions between high-energy particles. When far apart the particles, definite in 
number, are indeed free and are just the mode quanta of some quantized fields. But, when 
they interact, we must expect to see changes in the numbers of quanta, and can envisage 
processes in which the number of quanta which emerge finally as free particles is different 
from the number that originally collided. From the quantum mechanical examples which we
have discussed, we expect that these interactions will be produced by terms like $\hat\phi^3$ or $\hat\phi^4$,
since the free (‘harmonic’) case has $\hat\phi^2$, analogous to $\hat q^2$ in the quantum mechanics example.
Such terms arise in the solid state phonon application precisely from anharmonic corrections
involving the atomic displacements. These terms lead to non-trivial phonon-phonon scattering,
the treatment of which forms the basis of the quantum theory of thermal resistivity
of insulators. In the quantum field theory case, when we have generalized the formalism
to fermions and photons, the nonlinear interaction terms will produce $e^+e^-$ scattering, $q\bar q$
annihilation, and so on. As in the quantum mechanical case, the basic calculational method
will be perturbation theory.

As remarked earlier, the trouble with all these ‘real-life’ cases is that they involve sig- 
nificant complications due to spin; the corresponding fields then have several components, 
with attendant complexity in the solutions of the associated free-particle wave equations 
(Maxwell, Dirac). So in this chapter we shall seek to explain the essence of the perturbative 
approach to quantum field dynamics—which we take to be essentially the Feynman graph 
version of Yukawa’s exchange mechanism—in the context of simple models involving only 
scalar fields; Maxwell (vector) and Dirac (spinor) fields will be introduced in the following 
chapter. The route we follow to the ‘Feynman rules’ is the one first given (with remarkable 
clarity) by Dyson (1949a), which rapidly became the standard formulation. 

Before proceeding it may be worth emphasizing that in introducing a ‘non-harmonic’
term such as $\hat\phi^3$, and thus departing from linearity in that sense, we are in no way affecting
the basic linearity of state vector superposition in quantum mechanics (cf (6.11)), which
continues to hold.




6.2 Perturbation theory for interacting fields: the Dyson expansion 
of the S-matrix 


On the third day of the journey a remarkable thing happened; going into a sort of 
semi-stupor as one does after 48 hours of bus-riding, I began to think very hard about 
physics, and particularly about the rival radiation theories of Schwinger and Feynman. 
Gradually my thoughts grew more coherent, and before I knew where I was, I had solved 
the problem that had been in the back of my mind all this year, which was to prove the 
equivalence of the two theories. 


[From a letter from F. J. Dyson to his parents, 18 September 1948, as quoted in Schweber 
(1994), p 505.] 
For definiteness, let us consider the Lagrangian 


$$\hat{\mathcal L} = \tfrac{1}{2}\partial_\mu\hat\phi\,\partial^\mu\hat\phi - \tfrac{1}{2}m^2\hat\phi^2 - \lambda\hat\phi^3 \equiv \hat{\mathcal L}_{\rm KG} - \lambda\hat\phi^3 \tag{6.12}$$


with $\lambda > 0$. Equation (6.12) is like ‘$\hat{\mathcal L} = \mathcal T - \mathcal V$’ where $\mathcal V = \tfrac{1}{2}(\nabla\hat\phi)^2 + \tfrac{1}{2}m^2\hat\phi^2 + \lambda\hat\phi^3$ is the
‘potential’. Though simple, this Lagrangian is unfortunately not physically sensible. The
classical particle analogue potential would have the form $V(q) = \tfrac{1}{2}\omega^2q^2 + \lambda q^3$. If we sketch
$V(q)$ as a function of $q$ we see that, for small $\lambda$, it retains the shape of an oscillator well
near $q = 0$, but for $q$ sufficiently large and negative it will ‘turn over’, tending ultimately
to $-\infty$ as $q \to -\infty$. Classically we expect to be able to set up a successful perturbation
theory for oscillations about the equilibrium position $q = 0$, provided that the amplitude
of the oscillations is not so large as to carry the particle over the ‘lip’ of the potential; in
the latter case, the particle will escape to $q = -\infty$, invalidating a perturbative approach.
In the quantum mechanical case the same potential $V(q)$ is more problematical, since the
particle can tunnel through the barrier separating it from the region where $V \to -\infty$. This
means that the ground state will not be stable. An analogous disease affects the quantum
field case: the supposed vacuum state will be unstable, and indeed the energy will not be
positive-definite.
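
To make the ‘lip’ of the classical analogue potential quantitative, here is a short worked estimate. It assumes the potential is taken as $V(q) = \frac{1}{2}\omega^2q^2 + \lambda q^3$ (unit mass, $\lambda > 0$); the symbols $q_*$ and $V_*$ are introduced only for this illustration.

```latex
\begin{align*}
V'(q) &= \omega^2 q + 3\lambda q^2 = 0
  \quad\Rightarrow\quad
  q = 0 \ \text{(the minimum)}
  \quad\text{or}\quad
  q_* = -\frac{\omega^2}{3\lambda} \ \text{(the local maximum, i.e. the `lip')},\\[4pt]
V_* \equiv V(q_*) &= \frac{\omega^6}{18\lambda^2} - \frac{\omega^6}{27\lambda^2}
  = \frac{\omega^6}{54\lambda^2}.
\end{align*}
```

On this estimate, classical oscillations about $q = 0$ remain bounded only for energies below $\omega^6/54\lambda^2$. The barrier height grows as $\lambda \to 0$, which is consistent with perturbation theory about $q = 0$ being sensible for small $\lambda$, while tunnelling through a barrier of finite height remains possible quantum mechanically.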

Nevertheless, as the reader may already have surmised, and we shall confirm later in this
chapter, the ‘$\phi$-cubed’ interaction is precisely of the form relevant to Yukawa’s exchange
mechanism. As we have seen in the previous section, such an interaction will typically give
rise to matrix elements between one-quantum and two-quantum states, for example, exactly
like the basic Yukawa emission and absorption process. In fact, all that is necessary to make
the $\hat\phi^3$-type interaction physical is to let it describe, not the ‘self-coupling’ of a single field,
but the ‘interactive coupling’ of at least two different fields. For example, we may have
two scalar fields with quanta ‘A’ and ‘B’, and an interaction between them of the form
$\lambda\hat\phi_A^2\hat\phi_B$. This will allow processes such as $A \leftrightarrow A + B$. Or we may have three such fields,
and an interaction $\lambda\hat\phi_A\hat\phi_B\hat\phi_C$, allowing $A \leftrightarrow B + C$ and similar transitions. In these cases
the problems with the $\hat\phi^3$ self-interaction do not arise. (Incidentally those problems can
be eliminated by the addition of a suitable higher-power term, for instance $g\hat\phi^4$.) In later
sections we shall be considering the ‘ABC’ model specifically, but for the present it will be
simpler to continue with the single field $\hat\phi$ and the self-interaction $\lambda\hat\phi^3$, as described by the
Lagrangian (6.12). The associated Hamiltonian is

$$\hat H = \hat H_{\rm KG} + \hat H' \tag{6.13}$$

where (as is usual in perturbation theory) we have separated the Hamiltonian into a part
we can handle exactly, which is the free Klein-Gordon Hamiltonian

$$\hat H_{\rm KG} = \int d^3\boldsymbol{x}\,\hat{\mathcal H}_{\rm KG} = \tfrac{1}{2}\int d^3\boldsymbol{x}\,[\hat\pi^2 + (\nabla\hat\phi)^2 + m^2\hat\phi^2] \tag{6.14}$$

and the part we shall treat perturbatively

$$\hat H' = \int d^3\boldsymbol{x}\,\hat{\mathcal H}' = \lambda\int d^3\boldsymbol{x}\,\hat\phi^3. \tag{6.15}$$


6.2.1 The interaction picture 


We begin with a crucial formal step. In our introduction to quantum field theory in the 
previous chapter, we worked in the Heisenberg picture (HP). There, however, we only dealt 
with free (non-interacting) fields. The time dependence of the operators as given by the mode 
expansion (5.155) is that generated by the free KG Hamiltonian (6.14) via the Heisenberg 
equations of motion (see problem 5.8). But as soon as we include the interaction term $\hat H'$,
we cannot make progress in the HP, since we do not then know the time dependence of the
operators, which is generated by the full Hamiltonian $\hat H = \hat H_{\rm KG} + \hat H'$.

Instead, we might consider using the Schrödinger picture (SP) in which the states change
with time according to

$$\hat H|\psi(t)\rangle = i\frac{d}{dt}|\psi(t)\rangle \tag{6.16}$$


and the operators are time-independent (see appendix I). Note that although (6.16) is a 
‘Schrödinger picture’ equation, there is nothing non-relativistic about it: on the contrary, 
H is the relevant relativistic Hamiltonian. In this approach, the field operators appearing 
in the density H are all evaluated at a fixed time, say t = 0 by convention, which is the 
time at which the Schrödinger and Heisenberg pictures coincide. At this fixed time, mode 
expansions of the form (5.155) with t = 0 are certainly possible, since the basis functions 
form a complete set. 
One problem with this formulation, however, is that it is not going to be manifestly
‘Lorentz invariant’ (or covariant), because a particular time (t = 0) has been singled out. 
In the end, physical quantities should come out correct, but it is much more convenient to 
have everything looking nice and consistent with relativity as we go along. This is one of 
the reasons for choosing to work in yet a third ‘picture’, an ingenious kind of half-way-house 
between the other two, called the ‘interaction picture’ (IP). We shall see other good reasons 
shortly. 

In the HP, all the time dependence is carried by the operators and none by the state, 
while in the SP it is exactly the other way around. In the IP, both states and operators 
are time-dependent but in a way that is well adapted to perturbation theory, especially in 
quantum field theory. The operators have a time dependence generated by the free Hamiltonian
$\hat H_0$, say, and so a ‘free-particle’ mode expansion like (5.155) survives intact (here
$\hat H_0 = \hat H_{\rm KG}$). The states have a time dependence generated by the interaction $\hat H'$. Thus as
$\hat H' \to 0$ we return to the free-particle HP.

The way this works formally is as follows. In terms of the time-independent SP operator
$\hat A$ (cf appendix I), we define the corresponding IP operator $\hat A_I(t)$ by

$$\hat A_I(t) = e^{i\hat H_0 t}\hat A\,e^{-i\hat H_0 t}. \tag{6.17}$$

This is just like the definition of the HP operator $\hat A(t)$ in appendix I, except that $\hat H_0$ appears
instead of the full $\hat H$. It follows that the time dependence of $\hat A_I(t)$ is given by (I.8) with
$\hat H$ replaced by $\hat H_0$:

$$\frac{d\hat A_I(t)}{dt} = -i[\hat A_I(t), \hat H_0]. \tag{6.18}$$

Equation (6.18) can also, of course, be derived by carefully differentiating (6.17). Thus,
as mentioned already, the time dependence of $\hat A_I(t)$ is generated by the free part of the
Hamiltonian, by construction.
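
For readers who want the differentiation of (6.17) spelled out, the steps can be sketched as follows. The only inputs are that the SP operator $\hat A$ is time-independent and that $\hat H_0$ commutes with $e^{\pm i\hat H_0 t}$.

```latex
\begin{align*}
\frac{d\hat A_I(t)}{dt}
 &= \left(i\hat H_0\right) e^{i\hat H_0 t}\,\hat A\, e^{-i\hat H_0 t}
  + e^{i\hat H_0 t}\,\hat A\, e^{-i\hat H_0 t}\left(-i\hat H_0\right) \\
 &= i\hat H_0\,\hat A_I(t) - i\hat A_I(t)\,\hat H_0
  = -i\,[\hat A_I(t), \hat H_0].
\end{align*}
```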

As applied to our model theory (6.12), then, our field $\hat\phi$ will now be specified as being
in the IP, $\hat\phi_I(\boldsymbol{x}, t)$. What about the field canonically conjugate to $\hat\phi_I$, in the case when
the interaction is included? In the HP, as long as the interaction does not contain time
derivatives, as is the case here, the field canonically conjugate to the interacting field remains
the same as the free-field case:

$$\hat\pi(\boldsymbol{x}, t) = \frac{\partial\hat{\mathcal L}}{\partial\dot{\hat\phi}(\boldsymbol{x}, t)} = \frac{\partial\hat{\mathcal L}_{\rm KG}}{\partial\dot{\hat\phi}(\boldsymbol{x}, t)} \tag{6.19}$$

so that we continue to adopt the equal-time commutation relation

$$[\hat\phi(\boldsymbol{x}, t), \hat\pi(\boldsymbol{y}, t)] = i\delta^3(\boldsymbol{x} - \boldsymbol{y}) \tag{6.20}$$

for the Heisenberg fields. But the IP fields are related to the HP fields by a unitary transformation
$\hat U$, as we can see by combining (6.17) with (I.7):

$$\hat A_I(t) = e^{i\hat H_0 t}e^{-i\hat H t}\hat A(t)\,e^{i\hat H t}e^{-i\hat H_0 t} = \hat U\hat A(t)\hat U^{-1} \tag{6.21}$$

where $\hat U = e^{i\hat H_0 t}e^{-i\hat H t}$ and it is easy to check that $\hat U\hat U^\dagger = \hat U^\dagger\hat U = \hat 1$. So taking equation (6.20)
and pre-multiplying by $\hat U$ and post-multiplying by $\hat U^{-1}$ on both sides, we obtain

$$[\hat\phi_I(\boldsymbol{x}, t), \hat\pi_I(\boldsymbol{y}, t)] = i\delta^3(\boldsymbol{x} - \boldsymbol{y}) \tag{6.22}$$

showing that, in the interacting case, the IP fields $\hat\phi_I$ and $\hat\pi_I$ obey the free field commutation
relation. Thus in the IP case the interacting fields obey the same equations of motion and the
same commutation relations as the free-field operators. It follows that the mode expansion
(5.155), and the commutation relations (5.158) for the mode creation and annihilation
operators, can be taken straight over for the IP operators.

We now turn to the states in the IP. To preserve consistency between the matrix elements
in the Schrödinger and interaction pictures (cf the step from (I.6) to (I.7)) we define the
corresponding IP state vector by

$$|\psi(t)\rangle_I = e^{i\hat H_0 t}|\psi(t)\rangle \tag{6.23}$$

in terms of the SP state $|\psi(t)\rangle$. We now use (6.23) to find the equation of motion of $|\psi(t)\rangle_I$.
We have

$$\begin{aligned}
i\frac{d}{dt}|\psi(t)\rangle_I &= e^{i\hat H_0 t}\left\{-\hat H_0|\psi(t)\rangle + i\frac{d}{dt}|\psi(t)\rangle\right\} \\
&= e^{i\hat H_0 t}\left\{-\hat H_0|\psi(t)\rangle + (\hat H_0 + \hat H')|\psi(t)\rangle\right\} \\
&= e^{i\hat H_0 t}\hat H'|\psi(t)\rangle \\
&= e^{i\hat H_0 t}\hat H' e^{-i\hat H_0 t}|\psi(t)\rangle_I,
\end{aligned} \tag{6.24}$$

or

$$i\frac{d}{dt}|\psi(t)\rangle_I = \hat H'_I(t)|\psi(t)\rangle_I \tag{6.25}$$

where

$$\hat H'_I(t) = e^{i\hat H_0 t}\hat H' e^{-i\hat H_0 t} \tag{6.26}$$

is the interaction Hamiltonian in the interaction picture. The italicised words are important:
they mean that all operators in $\hat H'_I$ have the (known) free-field time dependence, which would
not be the case for $\hat H'$ in the HP. Thus, as mentioned earlier, the states in the IP have a
time dependence generated by the interaction Hamiltonian, and this derivation has shown
us that it is, in fact, the interaction Hamiltonian in the IP which is the appropriate generator
of time change in this picture.

Equation (6.25) is a slightly simplified form of the Tomonaga-Schwinger equation, which 
formed the starting point of the approach to QED followed by Schwinger (Schwinger 1948b, 
1949a, b) and independently by Tomonaga and his group (Tomonaga 1946, Koba, Tati and 
Tomonaga 1947a, b, Kanesawa and Tomonaga 1948a, b, Koba and Tomonaga 1948, Koba 
and Takeda 1948, 1949). 


6.2.2 The S-matrix and the Dyson expansion 


We now start the job of applying the IP formalism to scattering and decay processes in 
quantum field theory, treated in perturbation theory; for this, following Dyson (1949a, b), 
the crucial quantity is the scattering matrix, or S-matrix for short, which we now introduce. 
A scattering process may plausibly be described in the following terms. At a time $t \to -\infty$,
long before any interaction has occurred, we expect the effect of $\hat H'_I$ to be negligible so that,
from (6.25), $|\psi(-\infty)\rangle_I$ will be a constant state vector $|i\rangle$, which is in fact an eigenstate of $\hat H_0$.
Thus $|i\rangle$ will contain a certain number of non-interacting particles with definite momenta,
and $|\psi(-\infty)\rangle_I = |i\rangle$. As time evolves, the particles approach each other and may scatter,
leading in the distant future (at $t \to \infty$) to another constant state $|\psi(\infty)\rangle_I$ containing
non-interacting particles. Note that $|\psi(\infty)\rangle_I$ will in general contain many different components,
each with (in principle) different numbers and types of particles; these different components
in $|\psi(\infty)\rangle_I$ will be denoted by $|f\rangle$. The $\hat S$-operator is now defined via

$$|\psi(\infty)\rangle_I = \hat S|\psi(-\infty)\rangle_I = \hat S|i\rangle. \tag{6.27}$$

A particular S-matrix element is then the amplitude for finding a particular final state $|f\rangle$
in $|\psi(\infty)\rangle_I$:

$$\langle f|\psi(\infty)\rangle_I = \langle f|\hat S|i\rangle \equiv S_{fi}. \tag{6.28}$$

Thus we may write

$$|\psi(\infty)\rangle_I = \sum_f |f\rangle\langle f|\psi(\infty)\rangle_I = \sum_f S_{fi}|f\rangle. \tag{6.29}$$

It is clear that it is these S-matrix elements $S_{fi}$ that we need to calculate, and the associated
probabilities $|S_{fi}|^2$.
Before proceeding we note an important property of $\hat S$. Assuming that $|\psi(\infty)\rangle_I$ and $|i\rangle$
are both normalized, we have

$$1 = {}_I\langle\psi(\infty)|\psi(\infty)\rangle_I = \langle i|\hat S^\dagger\hat S|i\rangle = \langle i|i\rangle \tag{6.30}$$

implying that $\hat S$ is unitary: $\hat S^\dagger\hat S = \hat 1$. Taking matrix elements of this gives us the result

$$\sum_k S^*_{kf}S_{ki} = \delta_{fi}. \tag{6.31}$$

Putting $i = f$ in (6.31) yields $\sum_k |S_{ki}|^2 = 1$, which confirms that the expansion coefficients
in (6.29) must obey the usual condition that the sum of all the partial probabilities must add
up to 1. Note, however, that in the present case the states involved may contain different
numbers of particles.
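
As a concrete, deliberately tiny illustration of (6.31), the sketch below uses a $2\times 2$ unitary matrix as a stand-in for $S_{fi}$. The matrix $e^{-i\theta\sigma_x}$, the angle `theta` and the function names are assumptions chosen for this example; a real S-matrix acts on an infinite-dimensional space of multi-particle states.

```python
import math

def toy_s_matrix(theta):
    """A 2x2 unitary matrix, exp(-i * theta * sigma_x), standing in for S_fi."""
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c, 0.0), complex(0.0, -s)],
            [complex(0.0, -s), complex(c, 0.0)]]

def unitarity_sum(s_matrix, f, i):
    """Left-hand side of (6.31): sum over k of conj(S_kf) * S_ki."""
    return sum(s_matrix[k][f].conjugate() * s_matrix[k][i]
               for k in range(len(s_matrix)))

S = toy_s_matrix(0.3)
total_probability = unitarity_sum(S, 0, 0).real  # sum_k |S_k0|^2, should be 1
overlap = unitarity_sum(S, 1, 0)                 # f != i, should vanish

if __name__ == "__main__":
    print(total_probability, abs(overlap))
```

The diagonal case $f = i$ is the statement that the partial probabilities out of a given initial state sum to one; the off-diagonal case expresses the orthogonality of the evolved states.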

We set up a perturbation-theory approach to calculating $\hat S$ as follows. Integrating (6.25)
subject to the condition at $t \to -\infty$ yields

$$|\psi(t)\rangle_I = |i\rangle - i\int_{-\infty}^{t}\hat H'_I(t')|\psi(t')\rangle_I\,dt'. \tag{6.32}$$

This is an integral equation in which the unknown $|\psi(t)\rangle_I$ is buried under the integral on the
right-hand side, rather similar to the one we encounter in non-relativistic scattering theory
(equation (H.12) of appendix H). As in that case, we solve it iteratively. If $\hat H'_I$ is neglected
altogether, then the solution is

$$|\psi(t)\rangle_I^{(0)} = |i\rangle. \tag{6.33}$$

To get the first order in $\hat H'_I$ correction to this, insert (6.33) in place of $|\psi(t')\rangle_I$ on the
right-hand side of (6.32) to obtain

$$|\psi(t)\rangle_I^{(1)} = |i\rangle + \int_{-\infty}^{t}(-i\hat H'_I(t_1))\,dt_1\,|i\rangle \tag{6.34}$$

recalling that $|i\rangle$ is a constant state vector. Putting this back into (6.32) yields $|\psi(t)\rangle_I$ correct
to second order in $\hat H'_I$:

$$|\psi(t)\rangle_I^{(2)} = \left\{1 + \int_{-\infty}^{t}(-i\hat H'_I(t_1))\,dt_1 + \int_{-\infty}^{t}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat H'_I(t_1))(-i\hat H'_I(t_2))\right\}|i\rangle \tag{6.35}$$

which is as far as we intend to go. Letting $t \to \infty$ then gives us our perturbative series for
the $\hat S$-operator:

$$\hat S = 1 + \int_{-\infty}^{\infty}(-i\hat H'_I(t_1))\,dt_1 + \int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat H'_I(t_1))(-i\hat H'_I(t_2)) + \cdots \tag{6.36}$$

the dots indicating the higher-order terms, which are in fact summarized by the full formula

$$\hat S = \sum_{n=0}^{\infty}(-i)^n\int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\,\hat H'_I(t_1)\hat H'_I(t_2)\cdots\hat H'_I(t_n). \tag{6.37}$$
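
The iterative solution can be tried out on a system small enough to solve exactly. The sketch below is an illustration only, under stated assumptions: a two-level toy model with $\hat H_0 = \mathrm{diag}(0, \Delta)$ and $\hat H' = g\sigma_x$, with the interaction acting for a finite time from $0$ to $T$ rather than over all time. It evaluates the first-order term of the series numerically and compares the resulting transition probability with the exact two-level (Rabi) result. The names `first_order_amplitude` and `exact_probability` and all parameter values are hypothetical.

```python
import cmath
import math

def first_order_amplitude(g, delta, t_final, steps=4000):
    """First-order term -i * integral_0^T <1|H'_I(t)|0> dt, by the midpoint rule.

    In the toy model H_0 = diag(0, delta), H' = g * sigma_x, so that
    <1|H'_I(t)|0> = g * exp(i * delta * t), from the definition (6.26).
    """
    dt = t_final / steps
    integral = sum(g * cmath.exp(1j * delta * (n + 0.5) * dt)
                   for n in range(steps)) * dt
    return -1j * integral

def exact_probability(g, delta, t_final):
    """Exact probability of finding state |1> at time T, starting from |0>."""
    rabi = math.sqrt(delta ** 2 + 4.0 * g ** 2)
    return (4.0 * g ** 2 / rabi ** 2) * math.sin(0.5 * rabi * t_final) ** 2

g, delta, t_final = 0.01, 1.0, 2.0
p_first = abs(first_order_amplitude(g, delta, t_final)) ** 2
p_exact = exact_probability(g, delta, t_final)

if __name__ == "__main__":
    print(p_first, p_exact)
```

For a weak coupling such as `g = 0.01` the two probabilities agree closely, and the difference grows as `g` is increased, which is the behaviour one expects of a truncated perturbation series.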
We could immediately start getting to work with (6.37), but there is one more useful
technical adjustment to make. Remembering that

$$\hat H'_I(t) = \int\hat{\mathcal H}'_I(\boldsymbol{x}, t)\,d^3\boldsymbol{x} \tag{6.38}$$

we can write the second term of (6.36) as

$$\iint_{t_1 > t_2}d^4x_1\,d^4x_2\,(-i\hat{\mathcal H}'_I(x_1))(-i\hat{\mathcal H}'_I(x_2)) \tag{6.39}$$

which looks much more symmetrical in $\boldsymbol{x}$-$t$. However, there is still an awkward asymmetry
between the $\boldsymbol{x}$-integrals and the $t$-integrals because of the $t_1 > t_2$ condition. The $t$-integrals
can be converted to run from $-\infty$ to $\infty$ without constraint, like the $\boldsymbol{x}$ ones, by a clever
trick. Note that the ordering of the operators $\hat{\mathcal H}'_I$ is significant (since they will contain
non-commuting bits), and that it is actually given by the order of their time arguments, ‘earlier’
operators appearing to the right of ‘later’ ones. This feature must be preserved, obviously,
when we let the $t$-integrals run over the full infinite domain. We can arrange for this by
introducing the time-ordering symbol $T$, which is defined by

$$T(\hat{\mathcal H}'_I(x_1)\hat{\mathcal H}'_I(x_2)) = \begin{cases}\hat{\mathcal H}'_I(x_1)\hat{\mathcal H}'_I(x_2) & \text{for } t_1 > t_2 \\ \hat{\mathcal H}'_I(x_2)\hat{\mathcal H}'_I(x_1) & \text{for } t_1 < t_2\end{cases} \tag{6.40}$$

and similarly for more products, and for arbitrary operators. Then (see problem 6.1) (6.39)
can be written as

$$\frac{1}{2}\iint d^4x_1\,d^4x_2\,T[(-i\hat{\mathcal H}'_I(x_1))(-i\hat{\mathcal H}'_I(x_2))] \tag{6.41}$$


where the integrals are now unrestricted. Applying a similar analysis to the general term
gives us the Dyson expansion of the $\hat S$ operator:

$$\hat S = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int\cdots\int d^4x_1\,d^4x_2\ldots d^4x_n\,T\{\hat{\mathcal H}'_I(x_1)\hat{\mathcal H}'_I(x_2)\ldots\hat{\mathcal H}'_I(x_n)\}. \tag{6.42}$$

This fundamental formula provides the bridge leading from the Tomonaga-Schwinger
equation (6.25) to the Feynman amplitudes (Feynman 1949a, b), as we shall see in detail
in section 6.3.2 for the ‘ABC’ case.
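
The step from the nested time integrals to the time-ordered form with the factor $1/2$ can be checked numerically for operators that do not commute at different times. The following sketch is an illustration under stated assumptions, not the book's own construction: it uses a hypothetical $2\times 2$ matrix $H(t)$ in place of $\hat H'_I(t)$, a finite time interval, and a simple grid in which equal-time points are skipped (they have zero measure in the continuum limit).

```python
import cmath

def h_int(t, g=0.2, delta=1.0):
    """Toy time-dependent 2x2 Hamiltonian: g * [[0, e^{-i d t}], [e^{+i d t}, 0]]."""
    return [[0j, g * cmath.exp(-1j * delta * t)],
            [g * cmath.exp(1j * delta * t), 0j]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(acc, m, scale):
    """In-place update acc += scale * m for 2x2 matrices."""
    for i in range(2):
        for j in range(2):
            acc[i][j] += scale * m[i][j]

def time_ordered(t1, t2):
    """T(H(t1) H(t2)) as in (6.40): the later operator stands to the left."""
    if t1 > t2:
        return matmul2(h_int(t1), h_int(t2))
    return matmul2(h_int(t2), h_int(t1))

def compare(t_max=3.0, steps=60):
    """Return (nested integral over t1 > t2, half the unrestricted T-ordered integral)."""
    dt = t_max / steps
    times = [(n + 0.5) * dt for n in range(steps)]
    nested = [[0j, 0j], [0j, 0j]]
    symmetric = [[0j, 0j], [0j, 0j]]
    for a, t1 in enumerate(times):
        for b, t2 in enumerate(times):
            if a == b:
                continue  # equal times: zero measure in the continuum limit
            if a > b:
                add_scaled(nested, matmul2(h_int(t1), h_int(t2)), dt * dt)
            add_scaled(symmetric, time_ordered(t1, t2), 0.5 * dt * dt)
    return nested, symmetric

nested, symmetric = compare()
max_difference = max(abs(nested[i][j] - symmetric[i][j])
                     for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(max_difference)
```

The two results agree to rounding error even though $H(t_1)$ and $H(t_2)$ do not commute, because every unordered pair of times is counted twice in the unrestricted sum and the symbol $T$ puts both copies into the same order.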


6.3 Applications to the ‘ABC’ theory 


As previously explained, the simple self-interacting $\hat\phi^3$ theory is not respectable. Following
Griffiths (2008) we shall instead apply the foregoing covariant perturbation theory to a
hypothetical world consisting of three distinct types of scalar particles A, B, and C, with
masses $m_A$, $m_B$, and $m_C$, respectively. Each is described by a real scalar field which, if free,
would obey the appropriate KG equation; the interaction term is $g\hat\phi_A\hat\phi_B\hat\phi_C$. We shall from
now on omit the IP subscript ‘I’, since all operators are taken to be in the IP. Thus the
Hamiltonian is

$$\hat H = \hat H_0 + \hat H' \tag{6.43}$$

where

$$\hat H_0 = \tfrac{1}{2}\sum_{i=A,B,C}\int[\hat\pi_i^2 + (\nabla\hat\phi_i)^2 + m_i^2\hat\phi_i^2]\,d^3\boldsymbol{x} \tag{6.44}$$

and

$$\hat H' = g\int d^3\boldsymbol{x}\,\hat\phi_A\hat\phi_B\hat\phi_C \equiv \int d^3\boldsymbol{x}\,\hat{\mathcal H}'. \tag{6.45}$$

Each field $\hat\phi_i$ ($i$ = A, B, C) has a mode expansion of the form (5.143), and associated
creation and annihilation operators $\hat a_i^\dagger$ and $\hat a_i$ which obey the commutation relations

$$[\hat a_i(k), \hat a_j^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}')\delta_{ij}, \qquad i, j = \mathrm{A, B, C}. \tag{6.46}$$

The new feature in (6.46) is that operators associated with distinct particles commute. In
a similar way, we also have $[\hat a_i, \hat a_j] = [\hat a_i^\dagger, \hat a_j^\dagger] = 0$.


6.3.1 The decay C → A + B


As our first application of (6.42), we shall calculate the decay rate (or resonance width) for 
the decay C — A+B, to lowest order in g. Admittedly this is not yet a realistic, physical, 
example; even so, the basic steps in the calculation are common to more complicated physical 
examples, such as $W^- \to e^- + \bar\nu_e$.

We suppose that the initial state $|i\rangle$ consists of one C particle with 4-momentum $p_C$,
and that the final state in which we are interested is that with one A and one B particle
present, with 4-momenta $p_A$ and $p_B$ respectively. We want to calculate the matrix element

$$S_{fi} = \langle p_A, p_B|\hat S|p_C\rangle \tag{6.47}$$

to lowest order in $g$. (Note that the ‘1’ term in (6.36) cannot contribute here because the
initial and final states are plainly orthogonal.) This means that we need to evaluate the
amplitude

$$\mathcal A^{(1)}_{fi} = -ig\langle p_A, p_B|\int d^4x\,\hat\phi_A(x)\hat\phi_B(x)\hat\phi_C(x)|p_C\rangle. \tag{6.48}$$

To proceed we need to decide on the normalization of our states $|p_i\rangle$. We will define (for
$i$ = A, B, C)

$$|p_i\rangle = \sqrt{2E_i}\,\hat a_i^\dagger(p_i)|0\rangle \tag{6.49}$$

where $E_i = \sqrt{m_i^2 + \boldsymbol{p}_i^2}$, so that (using (6.46))

$$\langle p_i'|p_i\rangle = 2E_i(2\pi)^3\delta^3(\boldsymbol{p}_i' - \boldsymbol{p}_i). \tag{6.50}$$

The quantity $E_i\delta^3(\boldsymbol{p}_i' - \boldsymbol{p}_i)$ is Lorentz invariant. Note that the completeness relation for
such states reads

$$\int\frac{d^3\boldsymbol{p}_i}{(2\pi)^3}\frac{1}{2E_i}|p_i\rangle\langle p_i| = 1 \tag{6.51}$$

where the ‘1’ on the right-hand side means the identity in the subspace of such one-particle
states, and zero for all other states. The normalization choice (6.49) corresponds (see comment
(5) in section 5.2.5) to a wavefunction normalization of $2E_i$ particles per unit volume.
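Consider now just the $\hat\phi_C(x)|p_C\rangle$ piece of (6.48). This is

$$\int\frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2E_k}}\left[\hat a_C(k)e^{-ik\cdot x} + \hat a_C^\dagger(k)e^{ik\cdot x}\right]\sqrt{2E_C}\,\hat a_C^\dagger(p_C)|0\rangle \tag{6.52}$$

where $k = (E_k, \boldsymbol{k})$ and $E_k = \sqrt{\boldsymbol{k}^2 + m_C^2}$. The term with two $\hat a_C^\dagger$’s will give zero when
bracketed with a final state containing no C particles. In the other term, we use (6.46)
together with $\hat a_C(k)|0\rangle = 0$ to reduce (6.52) to

$$\int\frac{d^3\boldsymbol{k}}{(2\pi)^3}\frac{1}{\sqrt{2E_k}}(2\pi)^3\delta^3(\boldsymbol{p}_C - \boldsymbol{k})\sqrt{2E_C}\,e^{-ik\cdot x}|0\rangle = e^{-ip_C\cdot x}|0\rangle \tag{6.53}$$

where $p_C = (\sqrt{\boldsymbol{p}_C^2 + m_C^2}, \boldsymbol{p}_C)$. In exactly the same way we find that, when bracketed with
an initial state containing no A’s or B’s,

$$\langle p_A, p_B|\hat\phi_A(x)\hat\phi_B(x) = \langle 0|e^{ip_A\cdot x}e^{ip_B\cdot x}. \tag{6.54}$$

Hence the amplitude (6.48) becomes just

$$\mathcal A^{(1)}_{fi} = -ig\int d^4x\,e^{i(p_A + p_B - p_C)\cdot x} = -ig(2\pi)^4\delta^4(p_A + p_B - p_C). \tag{6.55}$$

Unsurprisingly, but reassuringly, we have discovered that the amplitude vanishes unless the
4-momentum is conserved via the $\delta$-function condition: $p_C = p_A + p_B$.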

It is clear that such a transition will not occur unless $m_C > m_A + m_B$ (in the rest frame
of the C, we need $m_C = \sqrt{m_A^2 + \boldsymbol{p}^2} + \sqrt{m_B^2 + \boldsymbol{p}^2}$), so let us assume this to be the case. We
would now like to calculate the rate for the decay C → A + B. To do this, we shall adopt
a plausible generalization of the ordinary procedure followed in quantum mechanical
time-dependent perturbation theory (the reader may wish to consult section H.3 of appendix H
at this point, to see a non-relativistic analogue). The first problem is that the transition
probability $|\mathcal A^{(1)}_{fi}|^2$ apparently involves the square of the four-dimensional $\delta$-function. This
is bad news since (to take a simple case, and using (E.53)) $\delta(x - a)\delta(x - a) = \delta(x - a)\delta(0)$
and $\delta(0)$ is infinite. In our case we have a four-fold infinity. This trouble has arisen because
we have been using plane-wave solutions of our wave equation, and these notoriously lead
to such problems. A proper procedure would set the whole thing up using wave packets,
as is done, for instance, in Peskin and Schroeder (1995), section 4.5. An easier remedy is
to adopt ‘box normalization’, in which we imagine that space has the finite volume $V$, and
the interaction is turned on only for a time $T$. Then ‘$(2\pi)^4\delta^4(0)$’ is effectively ‘$VT$’ (see
Weinberg (1995, section 3.4)). Dividing this factor out, the transition rate per unit volume
is then

$$\dot P_{fi} = |\mathcal A^{(1)}_{fi}|^2/VT = (2\pi)^4\delta^4(p_A + p_B - p_C)|\mathcal M_{fi}|^2 \tag{6.56}$$

where (cf (6.55))

$$\mathcal A^{(1)}_{fi} = (2\pi)^4\delta^4(p_A + p_B - p_C)\,i\mathcal M_{fi} \tag{6.57}$$

so that the invariant amplitude $i\mathcal M_{fi}$ is just $-ig$, in this case.

Equation (6.56) is the probability per unit time for a transition to one specific final
state $|f\rangle$. But in the present case (and in all similar ones with at least two particles in the
final state), the A + B final states form a continuum, and to get the total rate $\Gamma$ we need
to integrate $\dot P_{fi}$ over all the continuum of final states, consistent with energy-momentum
conservation. The corresponding differential decay rate $d\Gamma$ is defined by $d\Gamma = \dot P_{fi}\,dN_f$ where
$dN_f$ is the number of final states, per particle, lying in a momentum space volume $d^3\boldsymbol{p}_A\,d^3\boldsymbol{p}_B$
about $\boldsymbol{p}_A$ and $\boldsymbol{p}_B$. For the normalization (6.49), this number is

$$dN_f = \frac{d^3\boldsymbol{p}_A}{(2\pi)^3 2E_A}\frac{d^3\boldsymbol{p}_B}{(2\pi)^3 2E_B}. \tag{6.58}$$

Finally, to get a normalization-independent quantity we must divide by the number of
decaying particles per unit volume, which is $2E_C$. Thus our final formula for the decay rate
is

$$\Gamma = \int d\Gamma = \frac{1}{2E_C}(2\pi)^4\int\delta^4(p_A + p_B - p_C)|\mathcal M_{fi}|^2\frac{d^3\boldsymbol{p}_A}{(2\pi)^3 2E_A}\frac{d^3\boldsymbol{p}_B}{(2\pi)^3 2E_B}. \tag{6.59}$$

Note that the ‘$d^3\boldsymbol{p}/2E$’ factors are Lorentz invariant (see the exercise in appendix E) and
so are all the other terms in (6.59) except $E_C$, which contributes the correct
Lorentz-transformation character for a rate (i.e. rate $\propto 1/\gamma$).

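We now calculate the total rate $\Gamma$ in the rest frame of the decaying C particle. In this
case, the 3-momentum part of the $\delta^4$ gives $\boldsymbol{p}_A + \boldsymbol{p}_B = 0$, so $\boldsymbol{p}_A = \boldsymbol{p} = -\boldsymbol{p}_B$, and the energy
part becomes $\delta(E - m_C)$ where

$$E = \sqrt{m_A^2 + \boldsymbol{p}^2} + \sqrt{m_B^2 + \boldsymbol{p}^2} = E_A + E_B. \tag{6.60}$$

So the total rate is

$$\Gamma = \frac{1}{2m_C}\frac{g^2}{(2\pi)^2}\int\frac{d^3\boldsymbol{p}}{4E_AE_B}\,\delta(E - m_C). \tag{6.61}$$

Differentiating (6.60) we find

$$dE = \left(\frac{|\boldsymbol{p}|}{E_A} + \frac{|\boldsymbol{p}|}{E_B}\right)d|\boldsymbol{p}| = \frac{|\boldsymbol{p}|E}{E_AE_B}\,d|\boldsymbol{p}|. \tag{6.62}$$

Thus we may write

$$d^3\boldsymbol{p} = 4\pi|\boldsymbol{p}|^2\,d|\boldsymbol{p}| = 4\pi|\boldsymbol{p}|\frac{E_AE_B}{E}\,dE \tag{6.63}$$

and use the energy $\delta$-function in (6.61) to do the $dE$ integral yielding finally

$$\Gamma = \frac{g^2}{8\pi}\frac{|\boldsymbol{p}|}{m_C^2}. \tag{6.64}$$

The quantity $|\boldsymbol{p}|$ is actually determined from (6.60) now with $E = m_C$; after some algebra,
we find (problem 6.2)

$$|\boldsymbol{p}| = [m_A^4 + m_B^4 + m_C^4 - 2m_A^2m_B^2 - 2m_B^2m_C^2 - 2m_C^2m_A^2]^{1/2}/2m_C. \tag{6.65}$$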
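
The kinematics of (6.60), (6.64) and (6.65) are easy to check numerically. The sketch below is illustrative only: the function names and the sample masses and coupling (in arbitrary units with $\hbar = c = 1$) are assumptions chosen for the example, not values from the text.

```python
import math

def decay_momentum(m_c, m_a, m_b):
    """|p| of A (or B) in the rest frame of C, equation (6.65)."""
    if m_c < m_a + m_b:
        raise ValueError("decay C -> A + B is kinematically forbidden")
    bracket = (m_a ** 4 + m_b ** 4 + m_c ** 4
               - 2.0 * m_a ** 2 * m_b ** 2
               - 2.0 * m_b ** 2 * m_c ** 2
               - 2.0 * m_c ** 2 * m_a ** 2)
    # max(..., 0.0) guards against a tiny negative rounding error at threshold.
    return math.sqrt(max(bracket, 0.0)) / (2.0 * m_c)

def decay_rate(g, m_c, m_a, m_b):
    """Lowest-order rate (6.64): Gamma = g^2 |p| / (8 pi m_C^2)."""
    return g ** 2 * decay_momentum(m_c, m_a, m_b) / (8.0 * math.pi * m_c ** 2)

# Illustrative inputs (arbitrary units), chosen only for this example.
m_c, m_a, m_b, g = 5.0, 1.0, 2.0, 0.5
p = decay_momentum(m_c, m_a, m_b)
# Energy conservation (6.60) with E = m_C: this sum should reproduce m_c.
energy_check = math.sqrt(m_a ** 2 + p ** 2) + math.sqrt(m_b ** 2 + p ** 2)
rate = decay_rate(g, m_c, m_a, m_b)

if __name__ == "__main__":
    print(p, energy_check, rate)
```

Two limits are worth trying: at threshold, $m_C = m_A + m_B$, the momentum and hence the rate vanish, while for $m_A = m_B = 0$ the momentum is $m_C/2$.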


Equation (6.64) is the result of an ‘almost real life’ calculation and a number of comments
are in order. First, consider the question of dimensions. In our units $\hbar = c = 1$, $\Gamma$ as
an inverse time should have the dimensions of a mass (see appendix B), which can also be
understood if we think of $\Gamma$ as the width of an unstable resonance state. This requires ‘$g$’
to have the dimensions of a mass, i.e. $g \sim M$ in these units. Going back to our Hamiltonian
(6.44) and (6.45), which must also have dimensions of a mass, we see from (6.44) that the
scalar fields $\hat\phi_i \sim M$ (using $d^3\boldsymbol{x} \sim M^{-3}$), and hence from (6.45) $g \sim M$ as required. It turns
out that the dimensionality of the coupling constants (such as $g$) is of great significance in
quantum field theory. In QED, the analogous quantity is the charge $e$, and this is dimensionless
in our units ($\alpha = e^2/4\pi \approx 1/137$, see appendix C). However, we saw in (1.31) that
Fermi’s ‘four-fermion’ coupling constant G had dimensions ~ M~?, while Yukawa’s ‘gn’ 
and ‘g” (see figure 1.4) were both dimensionless. In fact, as we shall explain in section 11.8, 
the dimensionality of a theory’s coupling constant is an important guide as to whether the 
infinities generally present in the theory can be controlled by renormalization (see chapter 
10) or not: in particular, theories in which the coupling constant has negative mass dimen- 
sions, such as the ‘four-fermion’ theory, are not renormalizable. Theories with dimensionless 
coupling constants, such as QED, are generally renormalizable, though not invariably so. 
Theories whose coupling constants have positive mass dimension, as in the ABC model, 
are ‘super-renormalizable’, meaning (roughly) that they have fewer basic divergences than 
ordinary renormalizable theories (see section 11.8). 

In the present case, let us say that the mass of the decaying particle $m_C$ ‘sets the scale’
for $g$, so that we write $g = \tilde{g}m_C$ and then

$$\Gamma = \frac{\tilde{g}^2}{8\pi}\,|\mathbf{p}| \tag{6.66}$$

where $\tilde{g}$ is dimensionless. Equation (6.66) shows us nicely that $\Gamma$ is simply proportional to
the energy release in the decay, as determined by $|\mathbf{p}|$ (one often says that $\Gamma$ is determined
‘by the available phase space’). If $m_C$ is exactly equal to $m_A + m_B$, then $|\mathbf{p}|$ vanishes and so
does $\Gamma$. At the opposite extreme, if $m_A$ and $m_B$ are negligible compared to $m_C$, we would
have

$$\Gamma = \frac{\tilde{g}^2}{16\pi}\,m_C. \tag{6.67}$$

Equation (6.67) shows that, even if $\tilde{g}^2/16\pi$ is small ($\sim 1/137$ say) $\Gamma$ can still be surprisingly
large if $m_C$ is itself large, as in $W^- \to e^- + \bar\nu_e$ for example.
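The two limits just described can be made explicit with a short numerical sketch. It is offered only as an illustration: it assumes the standard factorization of the bracket in (6.65) as $[m_C^2 - (m_A + m_B)^2][m_C^2 - (m_A - m_B)^2]$, which is algebraically equivalent but is not written out in the text, and the function names and numerical values are hypothetical.

```python
import math

def momentum(m_c, m_a, m_b):
    """|p| of (6.65), using the factorized form of the bracket (assumed rewriting)."""
    if m_c < m_a + m_b:
        raise ValueError("m_C < m_A + m_B: no phase space for the decay")
    lam = (m_c**2 - (m_a + m_b) ** 2) * (m_c**2 - (m_a - m_b) ** 2)
    return math.sqrt(lam) / (2 * m_c)

def scaled_rate(g_tilde, m_c, m_a, m_b):
    """Gamma of (6.66): g_tilde^2 |p| / (8 pi), with dimensionless g_tilde."""
    return g_tilde**2 * momentum(m_c, m_a, m_b) / (8 * math.pi)

# Threshold: m_C = m_A + m_B, so |p| and Gamma vanish.
rate_at_threshold = scaled_rate(0.3, 3.0, 1.0, 2.0)

# Opposite extreme: massless decay products, which should reproduce (6.67).
rate_massless = scaled_rate(0.3, 80.0, 0.0, 0.0)
rate_from_6_67 = 0.3**2 * 80.0 / (16 * math.pi)
```

The factorized form makes the threshold behaviour visible at a glance: the first factor vanishes exactly when $m_C = m_A + m_B$.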


6.3.2 A + B → A + B scattering: the amplitudes


We now consider the two-particle → two-particle process

$$\mathrm{A} + \mathrm{B} \to \mathrm{A} + \mathrm{B} \tag{6.68}$$

in which the initial 4-momenta are $p_A$, $p_B$ and the final 4-momenta are $p'_A$, $p'_B$ so that
$p_A + p_B = p'_A + p'_B$. Our main task is to calculate the matrix element
$\langle p'_A, p'_B|\hat{S}|p_A, p_B\rangle$ to lowest non-trivial order in $g$. The result will be the
derivation of our first ‘Feynman rules’ for amplitudes in perturbative quantum field theory.

The first term in the $\hat{S}$-operator expansion (6.42) is ‘1’, which does not involve $g$ at all.
Nevertheless, it is a useful exercise to evaluate and understand this contribution (which in
the present case does not vanish), namely

$$\langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B)\hat{a}_A^\dagger(p_A)\hat{a}_B^\dagger(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}. \tag{6.69}$$


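We shall have to evaluate many such vacuum expectation values (vev) of products of $\hat{a}^\dagger$’s
and $\hat{a}$’s. The general strategy is to commute the $\hat{a}^\dagger$’s to the left, and the $\hat{a}$’s to the right,
and then make use of the facts

$$\langle 0|\hat{a}_i^\dagger = \hat{a}_i|0\rangle = 0 \tag{6.70}$$

for any $i$ = A, B, C. Thus, remembering that all ‘A’ operators commute with all ‘B’ ones,
the vev in (6.69) is equal to

$$\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}_A^\dagger(p_A)\{(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B) + \hat{a}_B^\dagger(p_B)\hat{a}_B(p'_B)\}|0\rangle \\
&\quad = \langle 0|\{(2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A) + \hat{a}_A^\dagger(p_A)\hat{a}_A(p'_A)\}(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B)|0\rangle \\
&\quad = (2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A)\,(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B).
\end{aligned} \tag{6.71}$$

FIGURE 6.1
The order $g^0$ term in the perturbative expansion: the two particles do not interact.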


The $\delta$-functions enforce $E_A = E'_A$ and $E_B = E'_B$ so that (6.69) becomes

$$2E_A(2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A)\,2E_B(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B) \tag{6.72}$$

a result which just expresses the normalization of the states, and the fact that, with no
‘$g$’ entering, the particles have not interacted at all, but have continued on their separate
ways, quite unperturbed ($\mathbf{p}_A = \mathbf{p}'_A$, $\mathbf{p}_B = \mathbf{p}'_B$). This contribution can be represented
diagrammatically as figure 6.1.

Next, consider the term of order $g$, which we used in C → A + B. This is

$$-ig\int d^4x\,\langle p'_A, p'_B|\hat\phi_A(x)\hat\phi_B(x)\hat\phi_C(x)|p_A, p_B\rangle. \tag{6.73}$$

We have to remember, now, that all the $\hat\phi_i$ operators are in the interaction picture and
are therefore represented by standard mode expansions involving the free creation and
annihilation operators $\hat{a}_i^\dagger$ and $\hat{a}_i$, i.e. the same ones used in defining the initial and final
state vectors. It is then obvious that (6.73) must vanish, since no C-particle exists in either
the initial or final state, and $\langle 0|\hat\phi_C|0\rangle = 0$.

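So we move on to the term of order $g^2$, which will provide the real meat of this chapter.
This term is

$$\begin{aligned}
\frac{(-ig)^2}{2}\int\!\!\int d^4x_1\,d^4x_2\,&\langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B) \\
&\times T\{\hat\phi_A(x_1)\hat\phi_B(x_1)\hat\phi_C(x_1)\hat\phi_A(x_2)\hat\phi_B(x_2)\hat\phi_C(x_2)\} \\
&\times \hat{a}_A^\dagger(p_A)\hat{a}_B^\dagger(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}.
\end{aligned} \tag{6.74}$$

The vev here involves the product of ten operators, so it will pay us to pause and think
how such things may be efficiently evaluated.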
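Consider the case of just four operators

$$\langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle \tag{6.75}$$

where each of $\hat{A}$, $\hat{B}$, $\hat{C}$, $\hat{D}$ is an $\hat{a}_i$, an $\hat{a}_i^\dagger$ or a linear combination of these. Let
$\hat{A}$ have the form $\hat{a} + \hat{a}^\dagger$. Then (using $\langle 0|\hat{a}^\dagger = \hat{a}|0\rangle = 0$)

$$\langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|\hat{a}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|[\hat{a}, \hat{B}\hat{C}\hat{D}]|0\rangle. \tag{6.76}$$

Now it is an algebraic identity that

$$[\hat{a}, \hat{B}\hat{C}\hat{D}] = [\hat{a}, \hat{B}]\hat{C}\hat{D} + \hat{B}[\hat{a}, \hat{C}]\hat{D} + \hat{B}\hat{C}[\hat{a}, \hat{D}]. \tag{6.77}$$

FIGURE 6.2
C-quantum propagating (a) for $t_1 > t_2$ (from $x_2$ to $x_1$) and (b) $t_1 < t_2$ (from $x_1$ to $x_2$).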


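Hence

$$\langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = [\hat{a}, \hat{B}]\langle 0|\hat{C}\hat{D}|0\rangle + [\hat{a}, \hat{C}]\langle 0|\hat{B}\hat{D}|0\rangle + [\hat{a}, \hat{D}]\langle 0|\hat{B}\hat{C}|0\rangle, \tag{6.78}$$

remembering that all the commutators, if non-vanishing, are just ordinary numbers (see
(6.46)). We can rewrite (6.78) in more suggestive form by noting that

$$[\hat{a}, \hat{B}] = \langle 0|[\hat{a}, \hat{B}]|0\rangle = \langle 0|\hat{a}\hat{B}|0\rangle = \langle 0|\hat{A}\hat{B}|0\rangle. \tag{6.79}$$

Thus the vev of a product of four operators is just the sum of the products of all the possible
pairwise ‘contractions’ (the name given to the vev of the product of two fields):

$$\langle 0|\hat{A}\hat{B}\hat{C}\hat{D}|0\rangle = \langle 0|\hat{A}\hat{B}|0\rangle\langle 0|\hat{C}\hat{D}|0\rangle + \langle 0|\hat{A}\hat{C}|0\rangle\langle 0|\hat{B}\hat{D}|0\rangle + \langle 0|\hat{A}\hat{D}|0\rangle\langle 0|\hat{B}\hat{C}|0\rangle. \tag{6.80}$$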


This result generalizes to the vev of the product of any number of operators; there is 
also a similar result for the vev of time-ordered products of operators, which is known as 
Wick’s theorem (Wick 1950), and is indispensable for a general discussion of quantum field 
perturbation theory. 
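The combinatorics behind ‘all the possible pairwise contractions’ can be made concrete with a short sketch. The generator below (the name `pairings` is a hypothetical helper, not something defined in the text) lists every way of splitting a set of operators into unordered pairs while keeping the original left-to-right order inside each pair. For four operators it reproduces the three terms of (6.80); for ten operators it shows how many candidate terms have to be examined before the vanishing contractions are discarded.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs.

    Within each pair the original left-to-right order is preserved,
    which is the order in which the two operators appear in the vev.
    """
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# Four operators: the three terms on the right-hand side of (6.80).
four = list(pairings("ABCD"))

# Ten operators, as in (6.74): number of candidate complete contractions.
count_ten = sum(1 for _ in pairings(range(10)))
```

The count for $2n$ operators is $(2n - 1)!! = 1 \cdot 3 \cdot 5 \cdots (2n - 1)$, so ten operators give 945 candidate terms. Most of them vanish in (6.74), because only contractions between operators of the same particle type can be non-zero.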

Consider then the application of (6.80), as generalized to ten operators, to the vev in
(6.74). The only kind of non-vanishing contractions are of the form $\langle 0|\hat{a}_i\hat{a}_i^\dagger|0\rangle$. Thus the
contractions of A-, B-, and C-type operators can be considered separately. As far as the
C-operators are concerned, then, we can immediately conclude that the only surviving
contraction is

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle. \tag{6.81}$$


This quantity is, in fact, of fundamental importance: it is called the Feynman propagator
(in coordinate space) for the spin-0 C-particle. We shall derive the mathematical formula
for it in due course, but for the moment let us understand its physical significance. Each of
the $\hat\phi_C$’s in (6.81) can create or destroy C-quanta, but for the vev to be non-zero anything
created in the ‘initial’ state must be destroyed in the ‘final’ one. Which of the times $t_1$ and
$t_2$ is initial or final is determined by the $T$-ordering symbol: for $t_1 > t_2$, a C-quantum is
created at $x_2$ and destroyed at $x_1$, while for $t_1 < t_2$ a C-quantum is created at $x_1$ and
destroyed at $x_2$. Thus the amplitude (6.81) may be represented pictorially as in figure 6.2,
where time increases to the right, and the vertical axis is a one-dimensional version of
three-dimensional space. It seems reasonable, indeed, to call this object the ‘propagator’, since it
clearly has to do with a quantum propagating between two space-time points.

We might now worry that this explicit time-ordering seems to introduce a Lorentz
non-invariant element into the calculation, ultimately threatening the Lorentz invariance of the
$\hat{S}$-operator (6.42). The reason that this is, in fact, not the case exposes an important
property of quantum field theory. If the two points $x_1$ and $x_2$ are separated by a time-like interval
(i.e. $(x_1 - x_2)^2 > 0$), then the time-ordering is Lorentz invariant; this is because no proper
Lorentz transformation can alter the time-ordering of time-like separated events (here, the
events are the creation/annihilation of particles/anti-particles at $x_1$ and $x_2$). By ‘proper’ is
meant a transformation that does not reverse the sense of time; the behaviour of the theory
under time-reversal is a different question altogether, discussed earlier in section 4.2.4. The
fact that time-ordering is invariant for time-like separated events is what guarantees that we
cannot influence our past, only our future. But what if the events are space-like separated,
$(x_1 - x_2)^2 < 0$? We know that the scalar fields $\hat\phi_i(x_1)$ and $\hat\phi_i(x_2)$ commute for equal times:
remarkably, one can show (problem 5.6(b)) that they also commute for $(x_1 - x_2)^2 < 0$;
so in this sector of $x_1 - x_2$ space the time-ordering symbol is irrelevant. Thus, contrary to
appearances, the $T$-product vev is Lorentz invariant. For the same reason, the $\hat{S}$ operator
of (6.42) is also Lorentz invariant: see, for example, Weinberg (1995, section 3.5).
The property

$$[\hat\phi_i(x_1), \hat\phi_i(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{6.82}$$


has an important physical interpretation. In quantum mechanics, if operators representing 
physical observables commute with each other, then measurements of either observable can 
be performed without interfering with each other; the observables are said to be ‘compatible’. 
This is just what we would want for measurements done at two points which are space- 
like separated—no signal with speed less than or equal to light can connect them, and so 
we would expect them to be non-interfering. Condition (6.82) is often called a ‘causality’ 
condition. 

More mathematically, the amplitude (6.81) is in fact a Green function for the KG operator
$(\Box + m_C^2)$ (see appendix G, and problem 6.3). That is to say,

$$(\Box_{x_1} + m_C^2)\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = -i\delta^4(x_1 - x_2). \tag{6.83}$$

Actually, problem 6.3 shows that (6.83) is true even when the $\langle 0|$ and $|0\rangle$ are removed,
i.e. the operator quantity $T(\hat\phi_C(x_1)\hat\phi_C(x_2))$ is itself a KG Green function. The work of
appendices G and H indicates the central importance of such Green functions in scattering
theory, so we need not be surprised to find such a thing appearing here.

Now let us figure out what are all the surviving terms in the vev in (6.74). As far as
contractions involving $\hat{a}_A(p'_A)$ are concerned, we have only three non-zero possibilities:

$$\langle 0|\hat{a}_A(p'_A)\hat{a}_A^\dagger(p_A)|0\rangle, \qquad \langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle, \qquad \langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_2)|0\rangle. \tag{6.84}$$

There are similar possibilities for $\hat{a}_A^\dagger(p_A)$, $\hat{a}_B(p'_B)$, and $\hat{a}_B^\dagger(p_B)$. The upshot is that we have
only the following pairings to consider:


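$$\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}_A^\dagger(p_A)|0\rangle\langle 0|\hat{a}_B(p'_B)\hat{a}_B^\dagger(p_B)|0\rangle \\
&\quad\times \langle 0|T(\hat\phi_A(x_1)\hat\phi_A(x_2))|0\rangle\langle 0|T(\hat\phi_B(x_1)\hat\phi_B(x_2))|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle;
\end{aligned} \tag{6.85}$$

$$\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat{a}_A^\dagger(p_A)|0\rangle\langle 0|\hat{a}_B(p'_B)\hat\phi_B(x_1)|0\rangle \\
&\quad\times \langle 0|\hat\phi_B(x_2)\hat{a}_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle\langle 0|T(\hat\phi_A(x_1)\hat\phi_A(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.86}$$

$$\begin{aligned}
&\langle 0|\hat{a}_B(p'_B)\hat{a}_B^\dagger(p_B)|0\rangle\langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle \\
&\quad\times \langle 0|\hat\phi_A(x_2)\hat{a}_A^\dagger(p_A)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle\langle 0|T(\hat\phi_B(x_1)\hat\phi_B(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.87}$$

$$\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle\langle 0|\hat\phi_A(x_2)\hat{a}_A^\dagger(p_A)|0\rangle\langle 0|\hat{a}_B(p'_B)\hat\phi_B(x_1)|0\rangle \\
&\quad\times \langle 0|\hat\phi_B(x_2)\hat{a}_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2;
\end{aligned} \tag{6.88}$$

$$\begin{aligned}
&\langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle\langle 0|\hat\phi_A(x_2)\hat{a}_A^\dagger(p_A)|0\rangle\langle 0|\hat{a}_B(p'_B)\hat\phi_B(x_2)|0\rangle \\
&\quad\times \langle 0|\hat\phi_B(x_1)\hat{a}_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle \\
&\quad + x_1 \leftrightarrow x_2.
\end{aligned} \tag{6.89}$$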


We already know that quantities like $\langle 0|\hat{a}_A(p'_A)\hat{a}_A^\dagger(p_A)|0\rangle$ yield something proportional to
$\delta^3(\mathbf{p}_A - \mathbf{p}'_A)$ and correspond to the initial A-particle going ‘straight through’. The other
factors in (6.86)–(6.89) which are new are quantities like $\langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle$, which has
the value (problem 6.4)

$$\langle 0|\hat{a}_A(p'_A)\hat\phi_A(x_1)|0\rangle = \frac{1}{\sqrt{2E'_A}}\,e^{ip'_A\cdot x_1} \tag{6.90}$$

which is proportional (depending on the adopted normalization) to the wavefunction for an
outgoing A-particle with 4-momentum $p'_A$.

We are now in a position to give a diagrammatic interpretation of all of (6.85)—(6.89). In 
these diagrams, we shall not (as we did in figure 6.2) draw two separately time-ordered pieces 
for each propagator. We shall not indicate the time-ordering at all and we shall understand 
that both time-orderings are always included in each propagator line. Term (6.85) then has 
the structure shown in figure 6.3(a); term (6.86) that shown in figure 6.3(b); term (6.87) 
that in figure 6.3(c); term (6.88) that in figure 6.3(d); and term (6.89) that in figure 6.3(e). 
We recognize in figure 6.3(e) the long-awaited Yukawa exchange process, which we shall 
shortly analyse in full—but the formalism has yielded much else besides! We shall come 
back to figures 6.3(a), (b), and (c) in section 6.3.5; for the moment we note that these 
processes do not represent true interactions between the particles, since at least one goes 
through unscattered in each case. So we shall concentrate on figures 6.3(d) and (e), and 
derive the Feynman rules for them. 

First, consider figure 6.3(e), corresponding to the contraction (6.89). When this is inserted
into (6.74), the two terms in which $x_1$ and $x_2$ are interchanged give identical results
(interchanging $x_1$ and $x_2$ in the integral), so the contribution we are discussing is

$$(-ig)^2\int\!\!\int d^4x_1\,d^4x_2\,e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2}\,\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle. \tag{6.91}$$


We must now turn our attention, as promised, to the propagator of (6.81),
$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle$. Inserting the mode expansion (6.52) for each of $\hat\phi_C(x_1)$ and $\hat\phi_C(x_2)$,
and using the commutation relations (6.46) and the vacuum conditions (6.70) we find
(problem 6.5)

$$\begin{aligned}
\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{d^3k}{(2\pi)^3 2\omega_k}\,\big[&\theta(t_1 - t_2)\,e^{-i\omega_k(t_1 - t_2) + i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)} \\
&+ \theta(t_2 - t_1)\,e^{-i\omega_k(t_2 - t_1) + i\mathbf{k}\cdot(\mathbf{x}_2 - \mathbf{x}_1)}\big]
\end{aligned} \tag{6.92}$$

where $\omega_k = (\mathbf{k}^2 + m_C^2)^{1/2}$. This expression is very ‘uncovariant looking’, due to the presence
of the $\theta$-functions with time arguments. But the earlier discussion, after (6.81), has assured
us that the left-hand side of (6.92) must be Lorentz invariant, and, by a clever trick, it is
possible to recast the right-hand side in manifestly invariant form. We introduce an integral
representation of the $\theta$-function via


$$\theta(t) = i\int_{-\infty}^{\infty}\frac{dz}{2\pi}\,\frac{e^{-izt}}{z + i\epsilon} \tag{6.93}$$

where $\epsilon$ is an infinitesimally small positive quantity (see appendix F). Multiplying (6.93)
by $e^{-i\omega_k t}$ and changing $z$ to $z - \omega_k$ in the integral we have

$$\theta(t)\,e^{-i\omega_k t} = i\int_{-\infty}^{\infty}\frac{dz}{2\pi}\,\frac{e^{-izt}}{z - (\omega_k - i\epsilon)}. \tag{6.94}$$

FIGURE 6.3
Graphical representation of (6.85)–(6.89): (a) (6.85); (b) (6.86); (c) (6.87); (d) (6.88); (e) (6.89).
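Putting (6.94) into (6.92) then yields

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = i\int\frac{d^3k\,dz}{(2\pi)^4 2\omega_k}\left\{\frac{e^{-iz(t_1 - t_2) + i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)}}{z - (\omega_k - i\epsilon)} + \frac{e^{iz(t_1 - t_2) - i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)}}{z - (\omega_k - i\epsilon)}\right\}. \tag{6.95}$$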

The exponentials and the volume element demand a more symmetrical notation. Let us
write $k_0 = z$ so that $(k_0 = z, \mathbf{k})$ form the components of a 4-vector $k^\mu$.¹ Note very carefully,
however, that $k_0$ is not $(\mathbf{k}^2 + m_C^2)^{1/2}$! The variable $k_0$ is unrestricted, whereas it is $\omega_k$ that
equals $(\mathbf{k}^2 + m_C^2)^{1/2}$. With this change of notation, (6.95) becomes


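$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = i\int\frac{d^4k}{(2\pi)^4 2\omega_k}\left\{\frac{e^{-ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)} + \frac{e^{ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)}\right\}. \tag{6.96}$$

Changing $k \to -k$ ($k_0 \to -k_0$, $\mathbf{k} \to -\mathbf{k}$) in the second term in (6.96), we finally have

$$\begin{aligned}
\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle &= i\int\frac{d^4k}{(2\pi)^4 2\omega_k}\,e^{-ik\cdot(x_1 - x_2)}\left\{\frac{1}{k_0 - (\omega_k - i\epsilon)} - \frac{1}{k_0 + \omega_k - i\epsilon}\right\} \\
&= \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - (\omega_k - i\epsilon)^2},
\end{aligned} \tag{6.97}$$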


¹ We know that the left-hand side of (6.95) is Lorentz invariant, and that $(t_1 - t_2, \mathbf{x}_1 - \mathbf{x}_2)$ form the
components of a 4-vector. The quantities $(k_0 = z, \mathbf{k})$ must also form the components of a 4-vector, in order
for the exponentials in (6.95) to be invariant.

FIGURE 6.4
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of (6.100).


or

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - \mathbf{k}^2 - m_C^2 + i\epsilon} \tag{6.98}$$

where in the last step we have used $\omega_k^2 = \mathbf{k}^2 + m_C^2$ and written ‘$i\epsilon$’ for ‘$2i\epsilon\omega_k$’ since what
matters is just the sign of the small imaginary part (note that $\omega_k$ is defined as the positive
square root). In this final form, the Lorentz invariance of the scalar propagator is indeed
manifest.
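The last two steps, combining the two pole terms of (6.97) and then trading $(\omega_k - i\epsilon)^2$ for $\omega_k^2 - i\epsilon$, can be checked with ordinary complex arithmetic. The sketch below is only an illustration with arbitrarily chosen numbers. It confirms that the two-pole form equals the covariant form up to a factor $(\omega_k - i\epsilon)/\omega_k$, which tends to 1 as $\epsilon \to 0$, and that the small imaginary part in the denominator is positive.

```python
def two_pole_form(k0, omega, eps):
    """(i / 2 omega) [1/(k0 - (omega - i eps)) - 1/(k0 + omega - i eps)], as in (6.97)."""
    a = omega - 1j * eps
    return (1j / (2 * omega)) * (1 / (k0 - a) - 1 / (k0 + a))

def covariant_form(k0, omega, eps):
    """i / (k0^2 - (omega - i eps)^2), the second line of (6.97)."""
    a = omega - 1j * eps
    return 1j / (k0**2 - a**2)

# Arbitrary illustrative values; eps is small but finite here.
k0, omega, eps = 0.7, 1.3, 1e-3
a = omega - 1j * eps

exact_ratio = two_pole_form(k0, omega, eps) / covariant_form(k0, omega, eps)
relative_gap = abs(exact_ratio - 1)   # should equal eps / omega
denominator = k0**2 - a**2            # k0^2 - omega^2 + 2 i eps omega + eps^2
```

Because `denominator.imag` equals $2\epsilon\omega_k > 0$ for positive $\omega_k$, replacing it by a generic ‘$+i\epsilon$’ keeps the sign that fixes how the poles are avoided.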

We shall have more to say about this propagator (Green function) in section 6.3.3. For
the moment we simply note two points: first, it is the Fourier transform of $i/(k^2 - m_C^2 + i\epsilon)$,
as stated in appendix G, where $k^2 = k_0^2 - \mathbf{k}^2$; and second, it is a function of the coordinate
difference $x_1 - x_2$, as it has to be since we do not expect physics to depend on the choice of
origin. This second point gives us a clue as to how best to perform the $x_1$–$x_2$ integral in
(6.91). Let us introduce the new variables $x = x_1 - x_2$, $X = (x_1 + x_2)/2$. Then (problem 6.6)
(6.91) reduces to

$$(-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\int d^4x\,e^{iq\cdot x}\int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot x}\,\frac{i}{k^2 - m_C^2 + i\epsilon} \tag{6.99}$$

$$= (-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon} \tag{6.100}$$
where $q = p_A - p'_B = p'_A - p_B$ is the 4-momentum transfer carried by the exchanged
C-quantum in figure 6.4, and we have used the four-dimensional version of (E.26). We associate
this single expression, which includes the two coordinate space processes of figure 6.2, with
the single momentum-space Feynman diagram of figure 6.4. The arrows refer merely to the
flow of 4-momentum, which is conserved at each ‘vertex’ (i.e. meeting of three lines). Thus
although the arrow on the exchanged C-line is drawn as indicated, this has nothing to do
with any presumed order of emission/absorption of the exchanged quantum. It cannot do
so, after all, since in this diagram the states all have definite 4-momentum and hence are
totally delocalized in space-time; equivalently, we recall from (6.91) that the amplitude in
fact involves integrals over all space-time.

FIGURE 6.5
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of (6.101).


A similar analysis (problem 6.7) shows that the contribution of the contractions (6.88)
to the S-matrix element (6.74) is

$$(-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{(p_A + p_B)^2 - m_C^2 + i\epsilon} \tag{6.101}$$

which is represented by the momentum-space Feynman diagram of figure 6.5.

At this point we may start to write down the Feynman rules for the ABC theory,
which enable us to associate a precise mathematical expression for an amplitude with a
Feynman diagram such as figure 6.4 or figure 6.5. It is clear that we will always have a
factor $(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)$ for all ‘connected’ diagrams, following from the flow of
the conserved 4-momentum through the diagrams. It is conventional to extract this factor,
and to define the invariant amplitude $\mathcal{M}_{fi}$ via

$$S_{fi} = \delta_{fi} + i(2\pi)^4\delta^4(p_f - p_i)\,\mathcal{M}_{fi} \tag{6.102}$$

in general (cf (6.57)). The rules reconstruct the invariant amplitude $i\mathcal{M}_{fi}$ corresponding to
a given diagram, and for the present case they are:


(i) At each vertex, a factor $-ig$.

(ii) For each internal line, a factor

$$\frac{i}{q^2 - m_i^2 + i\epsilon} \tag{6.103}$$

where $i$ = A, B, or C and $q$ is the 4-momentum carried by that line. The factor
(6.103) is the Feynman propagator in momentum space, for the scalar particle ‘$i$’.


Of course, it is no big deal to give a set of rules which will just reconstruct (6.100) and 
(6.101). The real power of the ‘rules’ is that they work for all diagrams we can draw by 
joining together vertices and propagators (except that we have not yet explained what to do 
if more than one particle appears ‘internally’ between two vertices, as in figures 6.3(a)—(c): 
see section 6.3.5). 
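Rules (i) and (ii) are simple enough to be transcribed directly. The following sketch does so for a diagram with two vertices joined by a single internal C line, as in figure 6.4 or figure 6.5. The function names, and the choice to pass $q^2$ rather than the 4-vector $q$, are conveniences of this illustration and are not taken from the text.

```python
def vertex(g):
    """Rule (i): a factor -i g at each vertex."""
    return -1j * g

def propagator(q2, mass, eps=0.0):
    """Rule (ii): i / (q^2 - m^2 + i eps) for an internal line carrying q."""
    return 1j / (q2 - mass**2 + 1j * eps)

def i_amplitude_single_exchange(g, q2, m_c, eps=0.0):
    """i M for two vertices joined by one internal C line."""
    return vertex(g) * propagator(q2, m_c, eps) * vertex(g)

# Illustrative numbers in arbitrary units: g = 2, q^2 = -3, m_C = 1.
i_m = i_amplitude_single_exchange(2.0, -3.0, 1.0)
m_invariant = i_m / 1j   # M itself, here -g^2 / (q^2 - m_C^2) = 1
```

With $q^2 = u$ this is the $i\mathcal{M}_{fi}$ read off from (6.100), and with $q^2 = s$ it is the one read off from (6.101). In both cases $\mathcal{M}_{fi} = -g^2/(q^2 - m_C^2)$ once the $i\epsilon$ is dropped, which is legitimate only away from the pole.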


6.3.3 A + B → A + B scattering: the Yukawa exchange mechanism, s and u channel processes


Referring back to section 1.3.3, equation (1.28), we see that the amplitude for the exchange
process of figure 6.4 indeed has the form suggested there, namely $\sim g^2/(q^2 - m_C^2)$ if C is
exchanged. We have seen how, in the static limit, this may be interpreted as a Yukawa
interaction of range $\hbar/m_Cc$ between the particles A and B, treated in the Born approximation.
Expression (6.100), then, provides us with the correct relativistic formula for this Yukawa
mechanism.

There is more to be said about this fundamental amplitude (6.100), which is essentially
the C propagator in momentum space. While it is always true that $p_i^2 = m_i^2$ for a free
particle of 4-momentum $p_i$ and rest mass $m_i$, it is not the case that $q^2 = m_C^2$ in (6.100). We
emphasized after (6.95) that the variable $k_0$ introduced there was not equal to $(\mathbf{k}^2 + m_C^2)^{1/2}$,
and the result of the step (6.99) → (6.100) was to replace $k_0$ by $q_0$ and $\mathbf{k}$ by $\mathbf{q}$, so that
$q_0 \neq (\mathbf{q}^2 + m_C^2)^{1/2}$, i.e. $q^2 = q_0^2 - \mathbf{q}^2 \neq m_C^2$. So the exchanged quantum in figure 6.4 does
not satisfy the ‘mass-shell condition’ $p^2 = m^2$; it is said to be ‘off-mass shell’ or ‘virtual’
(see also problem 6.8). It is quite a different entity from a free quantum. Indeed, as we saw
in more elementary physical terms in section 1.3.2, it has a fleeting existence, as sanctioned
by the uncertainty relation.

This ‘shell’ terminology requires a word of explanation. We may regard the condition
$q_0^2 - \mathbf{q}^2 = m_C^2$ as defining a surface in four-dimensional momentum space. Since this is hard
to visualize, let us suppose that we have just two spatial dimensions, so that the condition
becomes $q_0^2 - q_x^2 - q_y^2 = m_C^2$. This is a hyperboloid in $(q_0, q_x, q_y)$ space. If we take $q_0$ to be
the vertical axis, the surface will extend above the point $q_0 = m_C$ for a physical particle of
positive energy. It is this bowl-like surface which is the ‘shell’.

We add one more comment about (6.103). In the language of complex variable theory
introduced in Appendix F, the propagator has a singularity (it goes to infinity) at the point
$q^2 = m_i^2 - i\epsilon$, which is a simple pole. As $\epsilon \to 0$, this means that it has a pole at the on-shell
point $q^2 = m_i^2$. We may, indeed, define the (squared) mass of a particle as the position of
the pole in the corresponding propagator.

It is convenient, at this point, to introduce some kinematic variables which will appear
often in following chapters. These are the ‘Mandelstam variables’ (Mandelstam 1958, 1959)

$$s = (p_A + p_B)^2 \qquad t = (p_A - p'_A)^2 \qquad u = (p_A - p'_B)^2. \tag{6.104}$$

They are clearly relativistically invariant. In terms of these variables the amplitude (6.100)
is essentially $\sim 1/(u - m_C^2 + i\epsilon)$, and the amplitude (6.101) is $\sim 1/(s - m_C^2 + i\epsilon)$. The first is
said to be a ‘u-channel process’, the second an ‘s-channel process’. Amplitudes of the form
$(t - m^2)^{-1}$ or $(u - m^2)^{-1}$ are basically one-quantum exchange (i.e. ‘force’) processes, while
those of the form $(s - m^2)^{-1}$ have a rather different interpretation, as we now discuss.
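Before turning to that interpretation, it may help to see the variables (6.104) evaluated for explicit 4-momenta. The sketch below builds an elastic A + B → A + B configuration in a frame where the two momenta are equal and opposite, with the final momenta rotated by an angle $\theta$ and of the same magnitude as the initial ones (energy conservation fixes this, and it is assumed here rather than derived). The masses, the momentum and the angle are arbitrary illustrative values. The relation $s + t + u = 2m_A^2 + 2m_B^2$ used as a check is a standard identity for this process that is not stated in the text above.

```python
import math

def minkowski_sq(p):
    """p^2 = p_0^2 - |p|^2 for a 4-vector (E, px, py, pz)."""
    return p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2

def cm_elastic_momenta(p_abs, theta, m_a, m_b):
    """Initial and final 4-momenta for elastic scattering through angle theta."""
    e_a = math.sqrt(m_a**2 + p_abs**2)
    e_b = math.sqrt(m_b**2 + p_abs**2)
    px, pz = p_abs * math.sin(theta), p_abs * math.cos(theta)
    p_a, p_b = (e_a, 0.0, 0.0, p_abs), (e_b, 0.0, 0.0, -p_abs)
    p_a_out, p_b_out = (e_a, px, 0.0, pz), (e_b, -px, 0.0, -pz)
    return p_a, p_b, p_a_out, p_b_out

def mandelstam(p_a, p_b, p_a_out, p_b_out):
    """s, t, u exactly as defined in (6.104)."""
    s = minkowski_sq([x + y for x, y in zip(p_a, p_b)])
    t = minkowski_sq([x - y for x, y in zip(p_a, p_a_out)])
    u = minkowski_sq([x - y for x, y in zip(p_a, p_b_out)])
    return s, t, u

# Illustrative values, chosen with m_C < m_A + m_B.
m_a, m_b, m_c = 1.0, 1.5, 2.0
s, t, u = mandelstam(*cm_elastic_momenta(0.8, 1.1, m_a, m_b))
```

For these values $u$ comes out negative, so the exchanged C-quantum of figure 6.4 is indeed far from its mass shell $u = m_C^2$.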

Let us first ask: can $s = (p_A + p_B)^2$ ever equal $m_C^2$ in (6.101)? Since $s$ is invariant, we
can evaluate it in any frame we like, for example the centre-of-momentum (CM) frame in
which

$$(p_A + p_B)^2 = (E_A + E_B)^2 \tag{6.105}$$

with $E_A = (m_A^2 + \mathbf{p}^2)^{1/2}$, $E_B = (m_B^2 + \mathbf{p}^2)^{1/2}$. It is then clear that if $m_C < m_A + m_B$ the
condition $(p_A + p_B)^2 = m_C^2$ can never be satisfied, and the internal quantum in figure 6.5 is
always virtual (note that $p_A + p_B$ is the 4-momentum of the C-quantum). Depending on the
details of the theory with which we are dealing, such an s-channel process can have different
interpretations. In QED, for example, in the process $e^+ + e^- \to e^+ + e^-$ we could have a
virtual $\gamma$ s-channel process as shown in figure 6.6. This would be called an ‘annihilation
process’ for obvious reasons. In the process $\gamma + e^- \to \gamma + e^-$, however, we could have
figure 6.7 which would be interpreted as an absorption and re-emission process (i.e. of a
photon).

FIGURE 6.6
$O(e^2)$ contribution to $e^+e^- \to e^+e^-$ via annihilation to (and re-emission from) a virtual $\gamma$
state.

However, if $m_C > m_A + m_B$, then we can indeed satisfy $(p_A + p_B)^2 = m_C^2$, and so
(remembering that $\epsilon$ is infinitesimal) we seem to have an infinite result when $s$ (the square
of the CM energy) hits the value $m_C^2$. In fact, this is not the case. If $m_C > m_A + m_B$,
the C-particle is unstable against decay to A + B, as we saw in section 6.3.1. The s-channel
process must then be interpreted as the formation of a resonance, i.e. of the transitory
and decaying state consisting of the single C-particle. Such a process would be described
non-relativistically by a Breit-Wigner amplitude of the form

$$\mathcal{M} \propto 1/(E - E_R + i\Gamma/2) \tag{6.106}$$

which produces a peak in $|\mathcal{M}|^2$ centred at $E = E_R$ and full width $\Gamma$ at half-height; $\Gamma$ is, in
fact, precisely the width calculated in section 6.3.1. The relativistic generalization of (6.106)
is

$$\mathcal{M} \propto \frac{1}{s - M^2 + iM\Gamma} \tag{6.107}$$

where $M$ is the mass of the unstable particle. Thus in the present case the prescription for
avoiding the infinity in our amplitude is to replace the infinitesimal ‘$i\epsilon$’ in (6.101) by the
finite quantity $im_C\Gamma$, with $\Gamma$ as calculated in section 6.3.1. We shall see examples of such
s-channel resonances in section 9.5.
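The statement that $\Gamma$ is the full width at half-height can be checked directly from (6.106) and (6.107). The sketch below evaluates both line shapes with the proportionality constant set to 1, which is an arbitrary normalization, and with illustrative values of the resonance parameters.

```python
import math

def bw_nonrel_sq(energy, e_r, gamma):
    """|M|^2 from (6.106), normalization constant set to 1."""
    return 1.0 / ((energy - e_r) ** 2 + gamma**2 / 4.0)

def bw_rel_sq(s, mass, gamma):
    """|M|^2 from (6.107), normalization constant set to 1."""
    return 1.0 / ((s - mass**2) ** 2 + (mass * gamma) ** 2)

# Non-relativistic shape: half the peak value at E = E_R +/- Gamma/2.
e_r, gamma = 10.0, 0.4
half_ratio_nonrel = bw_nonrel_sq(e_r + gamma / 2, e_r, gamma) / bw_nonrel_sq(e_r, e_r, gamma)

# Relativistic shape: half the peak value at s = M^2 +/- M Gamma.
mass, width = 10.0, 0.1
half_ratio_rel = bw_rel_sq(mass**2 + mass * width, mass, width) / bw_rel_sq(mass**2, mass, width)

# Width of the peak expressed in the CM energy sqrt(s).
width_in_energy = math.sqrt(mass**2 + mass * width) - math.sqrt(mass**2 - mass * width)
```

For a narrow resonance ($\Gamma \ll M$) the quantity `width_in_energy` is very close to $\Gamma$, so the relativistic form (6.107) reduces to the non-relativistic one near the peak.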


6.3.4 A + B → A + B scattering: the differential cross section


We complete this exercise in the ‘ABC’ theory by showing how to calculate the cross section
for A + B → A + B scattering in terms of the invariant amplitude $\mathcal{M}_{fi}$ of (6.102). The
discussion will closely parallel the calculation of the decay rate $\Gamma$ in section 6.3.1.

As in (6.56), the transition rate per unit volume, in this case, is

$$\dot{P}_{fi} = (2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,|\mathcal{M}_{fi}|^2. \tag{6.108}$$


FIGURE 6.7
$O(e^2)$ contribution to $\gamma e^- \to \gamma e^-$ via absorption to (and re-emission from) a virtual $e^-$
state.

In order to obtain a quantity which may be compared from experiment to experiment,
we must remove the dependence of the transition rate on the incident flux of particles and
on the number of target particles per unit volume. Now the flux of beam particles (‘A’
ones, let us say) incident on a stationary target is just the number of particles per unit area
reaching the target in unit time which, with our normalization of ‘$2E$ particles per unit
volume’, is just

$$|\mathbf{v}|\,2E_A \tag{6.109}$$

where $\mathbf{v}$ is the velocity of the incident A in the rest frame of the target B. The number of
target particles per unit volume is $2E_B$ ($= 2m_B$ for B at rest, of course).

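We must also include the ‘density of final states’ factors, as in (6.59). Putting all this
together, the total cross section $\sigma$ is given in terms of the differential cross section $d\sigma$ by

$$\begin{aligned}
\sigma = \int d\sigma &= \frac{1}{2E_B\,2E_A|\mathbf{v}|}\,(2\pi)^4\int\delta^4(p_A + p_B - p'_A - p'_B)\,|\mathcal{M}_{fi}|^2\,\frac{d^3p'_A}{(2\pi)^3 2E'_A}\,\frac{d^3p'_B}{(2\pi)^3 2E'_B} \\
&= \frac{1}{4E_AE_B|\mathbf{v}|}\int|\mathcal{M}_{fi}|^2\,d\mathrm{Lips}(s; p'_A, p'_B)
\end{aligned} \tag{6.110}$$

where we have introduced the Lorentz invariant phase space $d\mathrm{Lips}(s; p'_A, p'_B)$ defined by

$$d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{d^3p'_A}{E'_A}\,\frac{d^3p'_B}{E'_B}. \tag{6.111}$$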


We can write the flux factor for collinear collisions in invariant form using the relation
(easily verified in a particular frame (problem 6.9))

$$E_AE_B|\mathbf{v}| = [(p_A\cdot p_B)^2 - m_A^2m_B^2]^{1/2}. \tag{6.112}$$

Everything in (6.110) is now written in invariant form.
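Relation (6.112) can be verified numerically in a couple of collinear frames. The sketch below is illustrative only. It represents each particle by the pair $(E, p_z)$ and, for frames other than the target rest frame, takes $|\mathbf{v}|$ to mean the relative velocity $|v_A - v_B|$ of the two collinear particles. That reading is an assumption of this sketch: it reduces to the velocity of A when B is at rest, which is the case defined in the text. Masses and momenta are arbitrary sample values.

```python
import math

def on_shell(mass, p_z):
    """(E, p_z) for a particle of the given mass moving along the z axis."""
    return (math.sqrt(mass**2 + p_z**2), p_z)

def flux_from_invariants(p_a, p_b, m_a, m_b):
    """Right-hand side of (6.112) for collinear (E, p_z) pairs."""
    dot = p_a[0] * p_b[0] - p_a[1] * p_b[1]
    return math.sqrt(dot**2 - (m_a * m_b) ** 2)

def flux_from_velocities(p_a, p_b):
    """E_A E_B |v_A - v_B| with v = p_z / E (relative velocity, assumed reading)."""
    v_rel = abs(p_a[1] / p_a[0] - p_b[1] / p_b[0])
    return p_a[0] * p_b[0] * v_rel

m_a, m_b = 1.0, 2.0

# Target rest frame: B at rest, A incident with |p| = 3.
lab_a, lab_b = on_shell(m_a, 3.0), on_shell(m_b, 0.0)

# Equal and opposite momenta of magnitude 3.
cm_a, cm_b = on_shell(m_a, 3.0), on_shell(m_b, -3.0)

lab_pair = (flux_from_velocities(lab_a, lab_b), flux_from_invariants(lab_a, lab_b, m_a, m_b))
cm_pair = (flux_from_velocities(cm_a, cm_b), flux_from_invariants(cm_a, cm_b, m_a, m_b))
```

In the target rest frame both expressions reduce to $m_B|\mathbf{p}_A|$, and for equal and opposite momenta both reduce to $|\mathbf{p}|(E_A + E_B)$.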
It is a useful exercise to evaluate $\int d\sigma$ in a given frame, and the simplest one is the
centre-of-momentum (CM) frame defined by

$$\mathbf{p}_A + \mathbf{p}_B = \mathbf{p}'_A + \mathbf{p}'_B = 0. \tag{6.113}$$


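However, before specializing to this frame, it is convenient to simplify our expression for
dLips. Using the 3-momentum part of the $\delta$-function in (6.110), we can eliminate the integral
over $d^3p'_B$:

$$\int\frac{d^3p'_B}{E'_B}\,\delta^4(p_A + p_B - p'_A - p'_B) = \frac{1}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B), \tag{6.114}$$

remembering also that now $\mathbf{p}'_B$ has to be replaced by $\mathbf{p}_A + \mathbf{p}_B - \mathbf{p}'_A$ in $\mathcal{M}_{fi}$. On the right-hand
side of (6.114), $\mathbf{p}'_B$ and $E'_B$ are no longer independent variables but are determined by the
conditions

$$\mathbf{p}'_B = \mathbf{p}_A + \mathbf{p}_B - \mathbf{p}'_A \qquad E'_B = (m_B^2 + \mathbf{p}_B'^{\,2})^{1/2}. \tag{6.115}$$

Next, convert $d^3p'_A$ to angular variables

$$d^3p'_A = \mathbf{p}_A'^{\,2}\,d|\mathbf{p}'_A|\,d\Omega. \tag{6.116}$$

The energy $E'_A$ is given by

$$E'_A = (m_A^2 + \mathbf{p}_A'^{\,2})^{1/2} \tag{6.117}$$

so that

$$E'_A\,dE'_A = |\mathbf{p}'_A|\,d|\mathbf{p}'_A|. \tag{6.118}$$

With all these changes we arrive at the result (valid in any frame)

$$d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}'_A|\,dE'_A}{E'_B}\,d\Omega\,\delta(E_A + E_B - E'_A - E'_B). \tag{6.119}$$

We now specialize to the CM frame for which $\mathbf{p}_A = \mathbf{p} = -\mathbf{p}_B$, $\mathbf{p}'_A = \mathbf{p}' = -\mathbf{p}'_B$, and

$$E'_A = (m_A^2 + \mathbf{p}'^{\,2})^{1/2} \qquad E'_B = (m_B^2 + \mathbf{p}'^{\,2})^{1/2} \tag{6.120}$$

so that

$$E'_A\,dE'_A = |\mathbf{p}'|\,d|\mathbf{p}'| = E'_B\,dE'_B. \tag{6.121}$$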

Introduce the variable $W' = E'_A + E'_B$ (note that $W'$ is only constrained to equal the total
energy $W = E_A + E_B$ after the integral over the energy-conserving $\delta$-function has been
performed). Then (as in (6.62))

$$dW' = dE'_A + dE'_B = \frac{W'|\mathbf{p}'|\,d|\mathbf{p}'|}{E'_AE'_B} = \frac{W'}{E'_B}\,dE'_A, \tag{6.122}$$

where we have used (6.121) in each of the last two steps. Thus the factor

$$|\mathbf{p}'_A|\,\frac{dE'_A}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B) \tag{6.123}$$

becomes

$$|\mathbf{p}'|\,\frac{dW'}{W'}\,\delta(W - W') \tag{6.124}$$

which reduces to $|\mathbf{p}|/W$ after integrating over $W'$, since the energy-conservation relation
forces $|\mathbf{p}'| = |\mathbf{p}|$. We arrive at the important result

$$d\mathrm{Lips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}|}{W}\,d\Omega \tag{6.125}$$

for the two-body phase space in the CM frame.
The last piece in the puzzle is the evaluation of the flux factor (6.112) in the CM frame.
In the CM we have

$$p_A\cdot p_B = (E_A, \mathbf{p})\cdot(E_B, -\mathbf{p}) \tag{6.126}$$

$$= E_AE_B + \mathbf{p}^2 \tag{6.127}$$

and a straightforward calculation shows that

$$(p_A\cdot p_B)^2 - m_A^2m_B^2 = \mathbf{p}^2W^2.$$

Hence we finally have

$$\sigma = \int d\sigma = \frac{1}{4|\mathbf{p}|W}\,\frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}|}{W}\int|\mathcal{M}_{fi}|^2\,d\Omega \tag{6.128}$$

and the CM differential cross section is

$$\left.\frac{d\sigma}{d\Omega}\right|_{\mathrm{CM}} = \frac{1}{(8\pi W)^2}\,|\mathcal{M}_{fi}|^2. \tag{6.129}$$

FIGURE 6.8
$O(g^4)$ contribution to the process A + B → A + B, in which a virtual transition
C → A + B → C occurs in the C propagator.
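As a worked illustration of (6.129), the sketch below evaluates the CM differential cross section using the two $O(g^2)$ amplitudes (6.100) and (6.101). Several choices here are assumptions of the sketch rather than statements from the text: the two contributions are simply added to form $\mathcal{M}_{fi}$, the $i\epsilon$ is dropped, and the case $m_C \geq m_A + m_B$ is refused, because there the s-channel denominator would need the finite width discussed in section 6.3.3. Natural units $\hbar = c = 1$ are used, with $g$ carrying the dimension of a mass, and all numbers are illustrative.

```python
import math

def tree_amplitude(g, s, u, m_c):
    """M from i M = (-i g)^2 [ i/(u - m_C^2) + i/(s - m_C^2) ], i-epsilon dropped."""
    return -g**2 * (1.0 / (u - m_c**2) + 1.0 / (s - m_c**2))

def dsigma_domega_cm(g, p_abs, cos_theta, m_a, m_b, m_c):
    """CM differential cross section of (6.129), in units of (mass)^-2."""
    if m_c >= m_a + m_b:
        raise ValueError("m_C >= m_A + m_B: a finite width is needed in the s channel")
    e_a = math.sqrt(m_a**2 + p_abs**2)
    e_b = math.sqrt(m_b**2 + p_abs**2)
    w = e_a + e_b                                   # total CM energy W
    s = w**2
    # u = (p_A - p'_B)^2 with |p'| = |p| and scattering angle theta of particle A
    u = (e_a - e_b) ** 2 - 2.0 * p_abs**2 * (1.0 + cos_theta)
    m = tree_amplitude(g, s, u, m_c)
    return abs(m) ** 2 / (8.0 * math.pi * w) ** 2

# Equal masses, backward scattering: u = 0, s = 8, so M = 6 g^2 / 7.
value = dsigma_domega_cm(1.0, 1.0, -1.0, 1.0, 1.0, 1.0)
expected = (36.0 / 49.0) / (512.0 * math.pi**2)
```

The angular dependence enters only through $u$, so at this order the distribution is controlled entirely by the exchange diagram of figure 6.4; the s-channel term is isotropic.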


6.3.5 A + B → A + B scattering: loose ends


We must now return to the amplitudes represented by figures 6.3(a)-(c), which we set 
aside earlier. Consider first figure 6.3(b). Here the A-particle has continued through without 
interacting, while the B-particle has made a virtual transition to the ‘A + C’ state, and then 
this state has reverted to the original B-state. So this is in the nature of a correction to the 
‘no-scattering’ piece shown in figure 6.1, and does not contribute to $\mathcal{M}_{fi}$. However, such a
virtual transition B → A + C → B does represent a modification of the properties of the
original single B state, due to its interactions with other fields as specified in $\hat{H}'_{\mathrm{I}}$. We can
easily imagine how, at order $g^4$, an amplitude will occur in which such a virtual process
is inserted into the C propagator in figure 6.4 so as to arrive at figure 6.8, from which it 
is plausible that such emission and reabsorption processes by the same particle effectively 
modify the propagator for this particle. This, in turn, suggests that part, at least, of their 
effect will be to modify the mass of the affected particle, so as to change it from the original 
value specified in the Lagrangian. We may think of this physically as being associated, in 
some way, with a particle’s carrying with it a ‘cloud’ of virtual particles, with which it 
is continually interacting; this will affect its mass, much as the mass of an electron in a 
solid becomes an ‘effective’ mass due to the various interactions experienced by the electron 
inside the solid. 

We shall postpone the evaluation of amplitudes such as those represented by fig- 
ures 6.3(b) and (c) to chapter 10. However, we note here just one feature: 4-momentum 
conservation applied at each vertex in figure 6.3(b) does not determine the individual 4- 
momenta of the intermediate A and C particles, only the sum of their 4-momenta, which 
is equal to $p_B$ (and this is equal to $p'_B$ also, so indeed no scattering has occurred). It is
plausible that, if an internal 4-momentum in a diagram is undetermined in terms of the 
external (fixed) 4-momenta of the physical process, then that undetermined 4-momentum 
should be integrated over. This is the case, as can be verified straightforwardly by evaluating 
the amplitude (6.86), for example, as we evaluated (6.89); a similar calculation will be gone 
through in detail in chapter 10, section 10.1.1. The corresponding Feynman rule is

(iii) For each internal 4-momentum $k$ which is not fixed by 4-momentum conservation,
carry out the integration $\int d^4k/(2\pi)^4$. One such integration with respect to an
internal 4-momentum occurs for each closed loop.

FIGURE 6.9
$O(g^4)$ disconnected diagrams in A + B → A + B.


If we apply this new rule to figure 6.3(b), we find that we need to evaluate the integral

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{(k^2 - m_A^2)}\,\frac{i}{((p_B - k)^2 - m_C^2)} \tag{6.130}$$


which, by simple counting of powers of k in numerator and denominator, is logarithmically 
divergent. Thus we learn that, almost before we have started quantum field theory in earnest, 
we seem to have run into a serious problem, which is going to affect all higher-order processes 
containing loops. The procedure whereby these infinities are tamed is called renormalization, 
and we shall return to it in chapter 10. 
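
The power counting can be made concrete with a small numerical sketch. The simplifications are ours and are not part of the calculation of chapter 10: we go over to Euclidean momenta, set the external momentum $p_B$ to zero, take equal masses $m$, drop all overall constant factors, and cut the radial integral off at $|k| = \Lambda$. The function names are illustrative only. Under these assumptions (6.130) reduces to a radial integral of $k^3/(k^2 + m^2)^2$, which grows like $\ln \Lambda$:

```python
import math


def radial_integrand(k, m=1.0):
    """Behaves like 1/k at large k: k^3 from d^4k, 1/k^4 from the two propagators."""
    return k ** 3 / (k * k + m * m) ** 2


def loop_integral_closed(cutoff, m=1.0):
    """Exact value of the radial integral from 0 to the cutoff."""
    u = cutoff * cutoff
    return 0.5 * (math.log((u + m * m) / (m * m)) + m * m / (u + m * m) - 1.0)


def loop_integral_simpson(cutoff, m=1.0, n=20000):
    """Composite Simpson estimate of the same integral (n must be even)."""
    h = cutoff / n
    total = radial_integrand(0.0, m) + radial_integrand(cutoff, m)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * radial_integrand(j * h, m)
    return total * h / 3.0


if __name__ == "__main__":
    # The value keeps growing as the cutoff is raised.
    for cutoff in (1e2, 1e4, 1e6):
        print(cutoff, loop_integral_closed(cutoff))
```

For $\Lambda \gg m$, each doubling of the cutoff adds approximately $\ln 2$ to the result, which is the signature of a logarithmic divergence.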

Finally, what about figure 6.3(a)? In this case nothing at all has occurred to either of the 
scattering particles, and instead a virtual trio of A + B + C has appeared from the vacuum, 
and then disappeared back again. Such processes are called, obviously enough, vacuum 
diagrams. This particular one is in fact only (another) correction to figure 6.1, and it makes 
no contribution to $\mathcal{M}_{fi}$. But as with figure 6.8, at $O(g^4)$ we can imagine such a vacuum 
process appearing ‘alongside’ figure 6.4 or figure 6.5, as in figures 6.9(a) and (b). These are 
called ‘disconnected diagrams’ and, since in them A and B have certainly interacted, they 
will contribute to $\mathcal{M}_{fi}$ (note that they are in this respect quite different from the ‘straight 
through’ diagrams of figures 6.3(b) and (c)). However, it turns out, rather remarkably, that 
their effect is exactly compensated by another effect we have glossed over: namely the 
fact that the vacuum $|0\rangle$ we have used in our S-matrix elements is plainly the unperturbed 
vacuum (or ground state), whereas surely the introduction of interactions will perturb it. 
A careful analysis of this (Peskin and Schroeder 1995, section 7.2) shows that $\mathcal{M}_{fi}$ is to be 
calculated from only the connected Feynman diagrams. 

In this chapter we have seen how the Feynman rules for scattering and decay amplitudes 
in a simple scalar theory are derived, and also how cross sections and decay rates are 
calculated. A Yukawa (u-channel) exchange process has been found, in its covariant form, 
and the analogous s-channel process, together with a hint of the complications which arise 
when loops are considered, at higher order in g. Unfortunately, however, none of this applies 
directly to any real physical process, since we do not know of any physical ‘scalar ABC’ 
interaction. Rather, the interactions in the Standard Model are all gauge interactions similar 
to electrodynamics (with the exception of the Higgs sector, which has both cubic and quartic 
scalar interactions). The mediating quanta of these gauge interactions have spin-1, not zero; 
furthermore, the matter fields (again apart from the Higgs field) have spin-1/2. It is time to 
begin discussing the complications of spin and the particular form of dynamics associated 
with the ‘gauge principle’. 


Problems 


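6.1 Show that, for a quantum field $\hat{f}(t)$ (suppressing the space coordinates), 

$$\int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{t_1} dt_2\, \hat{f}(t_1)\hat{f}(t_2) = \frac{1}{2}\int_{-\infty}^{\infty} dt_1 \int_{-\infty}^{\infty} dt_2\, T(\hat{f}(t_1)\hat{f}(t_2))$$

where 

$$T(\hat{f}(t_1)\hat{f}(t_2)) = \begin{cases} \hat{f}(t_1)\hat{f}(t_2) & \text{for } t_1 > t_2 \\ \hat{f}(t_2)\hat{f}(t_1) & \text{for } t_2 > t_1. \end{cases}$$
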
6.2 Verify equation (6.65). 
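6.3 Let $\hat{\phi}(x,t)$ be a real scalar KG field in one space dimension, satisfying 

$$(\Box + m^2)\hat{\phi}(x,t) \equiv \left(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} + m^2\right)\hat{\phi}(x,t) = 0.$$

(i) Explain why 

$$T(\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2)) = \theta(t_1 - t_2)\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2) + \theta(t_2 - t_1)\hat{\phi}(x_2,t_2)\hat{\phi}(x_1,t_1)$$

(see equation (E.49) for a definition of the $\theta$-function). 

(ii) Using equation (E.48), show that 

$$\frac{d}{dx}\theta(x - a) = \delta(x - a).$$

(iii) Using the result of (ii) with appropriate changes of variable, and equation (5.118), 
show that 

$$\frac{\partial}{\partial t_1}\{T(\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2))\} = \theta(t_1 - t_2)\dot{\hat{\phi}}(x_1,t_1)\hat{\phi}(x_2,t_2) + \theta(t_2 - t_1)\hat{\phi}(x_2,t_2)\dot{\hat{\phi}}(x_1,t_1).$$

(iv) Using (5.117) and (5.122) show that 

$$\frac{\partial^2}{\partial t_1^2}\{T(\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2))\} = -i\delta(x_1 - x_2)\delta(t_1 - t_2) + T(\ddot{\hat{\phi}}(x_1,t_1)\hat{\phi}(x_2,t_2))$$

and hence show that 

$$\left(\frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2\right) T(\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2)) = -i\delta(x_1 - x_2)\delta(t_1 - t_2).$$

This shows that $T(\hat{\phi}(x_1,t_1)\hat{\phi}(x_2,t_2))$ is a Green function (see appendix G, equation 
(G.25); the $i$ is included here conventionally) for the KG operator 

$$\frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2.$$

The four-dimensional generalization is immediate. 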
6.4 Verify (6.90). 
6.5 Verify (6.92). 
6.6 Verify (6.99) and (6.100). 


6.7 Show that the contribution of the contractions (6.88) to the S-matrix element (6.74) is 
given by (6.101). 


6.8 Consider the case of equal masses $m_A = m_B = m_C$. Evaluate $u$ of (6.104) in the CM 
frame (compare section 1.3.6), and show that $u \le 0$, so that $u$ can never equal $m_C^2$ in 
(6.100). (This result is generally true for such single particle ‘exchange’ processes.) 


6.9 Verify (6.112). 


7 


Quantum Field Theory III: Complex Scalar Fields, 
Dirac and Maxwell Fields; Introduction of 
Electromagnetic Interactions 


In the previous two chapters we have introduced the formalism of relativistic quantum 
field theory for the case of free real scalar fields obeying the Klein-Gordon (KG) equation 
of section 3.1, extended it to describe interactions between such quantum fields and shown 
how the Feynman rules for a simple Yukawa-like theory are derived. It is now time to return 
to the unfortunately rather more complicated real world of quarks and leptons interacting 
via gauge fields—in particular electromagnetism. For this, several generalizations of the 
formalism of chapter 5 are necessary. 

First, a glance back at chapter 2 will remind the reader that the electromagnetic inter- 
action has everything to do with the phase of wavefunctions, and hence presumably of their 
quantum field generalizations. Fields which are real must be electromagnetically neutral. 
Indeed, as noted very briefly in section 5.3, the quanta of a real scalar field are their own 
anti-particles; for a given mass, there is only one type of particle being created or destroyed. 
However, physical particles and anti-particles have identical masses (e.g. $e^-$ and $e^+$), and 
it is actually a deep result of quantum field theory that this is so (see section 4.2.5 and 
the end of section 7.1). In this case for a given mass m, there will have to be two distinct 
field degrees of freedom, one of which corresponds somehow to the ‘particle’ and the other 
to the ‘anti-particle’. This suggests that we will need a complex field if we want to distin- 
guish particle from anti-particle, even in the absence of electromagnetism (for example, the 
($K^0$, $\bar{K}^0$) pair). Such a distinction will have to be made in terms of some conserved quantum 
number (or numbers), having opposite values for ‘particle’ and ‘anti-particle’. This con- 
served quantum number must be associated with some symmetry. Now, referring again to 
chapter 2, we recall that electromagnetism is associated with invariance under local U(1) 
phase transformations. Even in the absence of electromagnetism, however, a theory with 
complex fields can exhibit a global U(1) phase invariance. As we shall show in section 7.1, 
such a symmetry indeed leads to the existence of a conserved quantum number, in terms of 
which we can distinguish the particle and anti-particle parts of a complex scalar field. 

In section 7.2 we generalize the complex scalar field to the complex spinor (Dirac) 
field, suitable for charged spin-1/2 particles. Again we find an analogous conserved quantum 
number, associated with a global U(1) phase invariance of the Lagrangian, which serves to 
distinguish particle from anti-particle. Central to the satisfactory physical interpretation of 
the Dirac field will be the requirement that it must be quantized with anti-commutation 
relations—the famous ‘spin-statistics’ connection. 

The electromagnetic field must then be quantized, and section 7.3.2 describes the con- 
siderable difficulties this poses. With all this in place, we can easily introduce (section 7.4) 
electromagnetic interactions via the ‘gauge principle’ of chapter 2. The resulting Lagrangians 
and Feynman rules will be applied to simple processes in the following chapter. In the final 
section of this chapter, we return to the discrete symmetries of chapter 4, and extend them 
from the single particle theory to quantum field theory. 


7.1 The complex scalar field: global U(1) phase invariance, 
particles and anti-particles 


Consider a Lagrangian for two free fields $\hat{\phi}_1$ and $\hat{\phi}_2$ having the same mass $M$: 

$$\hat{\mathcal{L}} = \tfrac{1}{2}\partial_\mu\hat{\phi}_1\partial^\mu\hat{\phi}_1 - \tfrac{1}{2}M^2\hat{\phi}_1^2 + \tfrac{1}{2}\partial_\mu\hat{\phi}_2\partial^\mu\hat{\phi}_2 - \tfrac{1}{2}M^2\hat{\phi}_2^2. \tag{7.1}$$


We shall see how this is appropriate to a ‘particle—anti-particle’ situation. 

In general ‘particle’ and ‘anti-particle’ are distinguished by having opposite values of one 
or more conserved additive quantum numbers. Since these quantum numbers are conserved, 
the operators corresponding to them commute with the Hamiltonian and are constant in 
time (in the Heisenberg formulation—see equation (5.59)); such operators are called sym- 
metry operators and will be increasingly important in later chapters. For the present we 
consider the simplest case in which ‘particle’ and ‘anti-particle’ are distinguished by having 
opposite eigenvalues of just one symmetry operator. This situation is already realized in the 
simple Lagrangian of (7.1). The symmetry involved is just this: $\hat{\mathcal{L}}$ of (7.1) is left unchanged 
(is invariant) if $\hat{\phi}_1$ and $\hat{\phi}_2$ are replaced by $\hat{\phi}_1'$ and $\hat{\phi}_2'$, where (cf (2.64)) 

$$\begin{aligned} \hat{\phi}_1' &= (\cos\alpha)\,\hat{\phi}_1 - (\sin\alpha)\,\hat{\phi}_2 \\ \hat{\phi}_2' &= (\sin\alpha)\,\hat{\phi}_1 + (\cos\alpha)\,\hat{\phi}_2 \end{aligned} \tag{7.2}$$

where $\alpha$ is a real parameter. This is like a rotation of coordinates about the z-axis of 
ordinary space, but of course it mixes field degrees of freedom, not spatial coordinates. The 
symmetry transformation of (7.2) is sometimes called an ‘O(2) transformation’, referring to 
the two-dimensional rotation group O(2). We can easily check the invariance of $\hat{\mathcal{L}}$, i.e. 

$$\hat{\mathcal{L}}(\hat{\phi}_1', \hat{\phi}_2') = \hat{\mathcal{L}}(\hat{\phi}_1, \hat{\phi}_2); \tag{7.3}$$


see problem 7.1. 
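
A quick numerical check of this invariance can be sketched as follows. The fields are treated here as classical numbers at a single space-time point, with their four gradients supplied as arbitrary 4-tuples; the sample values and the function names are our own illustrative choices, and the metric signature (+, −, −, −) is assumed.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+, -, -, -)


def minkowski_dot(a, b):
    """Contract two 4-vectors with the metric."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def lagrangian(phi1, dphi1, phi2, dphi2, mass):
    """Free Lagrangian density of two fields with a common mass, as in (7.1)."""
    kinetic = 0.5 * minkowski_dot(dphi1, dphi1) + 0.5 * minkowski_dot(dphi2, dphi2)
    mass_term = 0.5 * mass ** 2 * (phi1 ** 2 + phi2 ** 2)
    return kinetic - mass_term


def rotate(alpha, phi1, dphi1, phi2, dphi2):
    """Apply the rotation (7.2) to the fields and, by linearity, to their gradients."""
    c, s = math.cos(alpha), math.sin(alpha)
    new_phi1 = c * phi1 - s * phi2
    new_phi2 = s * phi1 + c * phi2
    new_dphi1 = tuple(c * a - s * b for a, b in zip(dphi1, dphi2))
    new_dphi2 = tuple(s * a + c * b for a, b in zip(dphi1, dphi2))
    return new_phi1, new_dphi1, new_phi2, new_dphi2


# Arbitrary sample field values and gradients (illustrative only).
SAMPLE = (0.8, (0.1, -0.4, 0.3, 0.9), -1.7, (0.6, 0.2, -0.5, 0.05))

if __name__ == "__main__":
    before = lagrangian(*SAMPLE, 1.3)
    after = lagrangian(*rotate(0.7, *SAMPLE), 1.3)
    print(before, after)
```

The two printed values agree to rounding error for any angle, because both the mass term and the kinetic term depend only on the ‘length’ of the pair of fields, which the rotation preserves.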

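Now let us see what is the conservation law associated with this symmetry. It is simpler 
(and sufficient) to consider an infinitesimal rotation characterized by the infinitesimal 
parameter $\epsilon$, for which $\cos\epsilon \approx 1$ and $\sin\epsilon \approx \epsilon$ so that (7.2) becomes 

$$\begin{aligned} \hat{\phi}_1' &= \hat{\phi}_1 - \epsilon\hat{\phi}_2 \\ \hat{\phi}_2' &= \hat{\phi}_2 + \epsilon\hat{\phi}_1 \end{aligned} \tag{7.4}$$

and we can define changes $\delta\hat{\phi}_i$ by 

$$\begin{aligned} \delta\hat{\phi}_1 &\equiv \hat{\phi}_1' - \hat{\phi}_1 = -\epsilon\hat{\phi}_2 \\ \delta\hat{\phi}_2 &\equiv \hat{\phi}_2' - \hat{\phi}_2 = +\epsilon\hat{\phi}_1. \end{aligned} \tag{7.5}$$

Under this transformation $\hat{\mathcal{L}}$ is invariant and so $\delta\hat{\mathcal{L}} = 0$. But $\hat{\mathcal{L}}$ is an explicit function of $\hat{\phi}_1$, 
$\hat{\phi}_2$, $\partial_\mu\hat{\phi}_1$, and $\partial_\mu\hat{\phi}_2$. Thus we can write 

$$0 = \delta\hat{\mathcal{L}} = \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_1)}\delta(\partial_\mu\hat{\phi}_1) + \frac{\partial\hat{\mathcal{L}}}{\partial\hat{\phi}_1}\delta\hat{\phi}_1 + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_2)}\delta(\partial_\mu\hat{\phi}_2) + \frac{\partial\hat{\mathcal{L}}}{\partial\hat{\phi}_2}\delta\hat{\phi}_2. \tag{7.6}$$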


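This is a bit like the manipulations leading up to the derivation of the Euler-Lagrange 
equations in section 5.2.4, but now the changes $\delta\hat{\phi}_i$ ($i = 1,2$) have nothing to do with 
space-time trajectories: they mix up the two fields. However, we can use the equations of 
motion for $\hat{\phi}_1$ and $\hat{\phi}_2$, namely $\partial\hat{\mathcal{L}}/\partial\hat{\phi}_i = \partial_\mu[\partial\hat{\mathcal{L}}/\partial(\partial_\mu\hat{\phi}_i)]$, to rewrite $\delta\hat{\mathcal{L}}$ as 

$$0 = \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_1)}\delta(\partial_\mu\hat{\phi}_1) + \left[\partial_\mu\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_1)}\right)\right]\delta\hat{\phi}_1 + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_2)}\delta(\partial_\mu\hat{\phi}_2) + \left[\partial_\mu\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_2)}\right)\right]\delta\hat{\phi}_2. \tag{7.7}$$

Since $\delta(\partial_\mu\hat{\phi}_i) = \partial_\mu(\delta\hat{\phi}_i)$, the right-hand side of (7.7) is just a total divergence, and (7.7) 
becomes 

$$0 = \partial_\mu\left[\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_1)}\delta\hat{\phi}_1 + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_2)}\delta\hat{\phi}_2\right]. \tag{7.8}$$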

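These formal steps are actually perfectly general, and will apply whenever a certain 
Lagrangian depending on two fields $\hat{\phi}_1$ and $\hat{\phi}_2$ is invariant under $\hat{\phi}_i \to \hat{\phi}_i + \delta\hat{\phi}_i$. In the present 
case, with $\delta\hat{\phi}_i$ given by (7.5), we have 

$$\begin{aligned} 0 &= \partial_\mu\left[-\epsilon\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_1)}\hat{\phi}_2 + \epsilon\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat{\phi}_2)}\hat{\phi}_1\right] \\ &= \epsilon\,\partial_\mu[(\partial^\mu\hat{\phi}_2)\hat{\phi}_1 - (\partial^\mu\hat{\phi}_1)\hat{\phi}_2] \end{aligned} \tag{7.9}$$

where the free-field Lagrangian (7.1) has been used in the second step. Since $\epsilon$ is arbitrary, 
we have proved that the 4-vector operator 

$$\hat{N}^\mu_\phi = \hat{\phi}_1\partial^\mu\hat{\phi}_2 - \hat{\phi}_2\partial^\mu\hat{\phi}_1 \tag{7.10}$$

is conserved: 

$$\partial_\mu\hat{N}^\mu_\phi = 0. \tag{7.11}$$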


Such conserved 4-vector operators are called symmetry currents, often denoted generically 
by $\hat{J}^\mu$. There is a general theorem (due to Noether (1918) in the classical field case) to the 
effect that if a Lagrangian is invariant under a continuous transformation, then there will 
be an associated symmetry current. We shall consider Noether’s theorem again in volume 
2. 

What does all this have to do with symmetry operators? Written out in full, (7.11) is 

$$\partial\hat{N}^0_\phi/\partial t + \nabla\cdot\hat{\mathbf{N}}_\phi = 0. \tag{7.12}$$

Integrating this equation over all space, we obtain 

$$\frac{d}{dt}\int_{V\to\infty}\hat{N}^0_\phi\, d^3x + \int_{S\to\infty}\hat{\mathbf{N}}_\phi\cdot d\mathbf{S} = 0 \tag{7.13}$$

where we have used the divergence theorem in the second term. Normally the fields may be 
assumed to die off sufficiently fast at infinity that the surface integral vanishes (by using 
wave packets, for example), and we can therefore deduce that the quantity $\hat{N}_\phi$ is constant 
in time, where 

$$\hat{N}_\phi = \int\hat{N}^0_\phi\, d^3x; \tag{7.14}$$


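that is, the volume integral of the $\mu = 0$ component of a symmetry current is a symmetry 
operator. 

In order to see how $\hat{N}_\phi$ serves to distinguish ‘particle’ from ‘anti-particle’ in the simple 
example we are considering, it turns out to be convenient to regard $\hat{\phi}_1$ and $\hat{\phi}_2$ as components 
of a single complex field 

$$\begin{aligned} \hat{\phi} &= \frac{1}{\sqrt{2}}(\hat{\phi}_1 - i\hat{\phi}_2) \\ \hat{\phi}^\dagger &= \frac{1}{\sqrt{2}}(\hat{\phi}_1 + i\hat{\phi}_2). \end{aligned} \tag{7.15}$$

The plane-wave expansions of the form (5.155) for $\hat{\phi}_1$ and $\hat{\phi}_2$ imply that $\hat{\phi}$ has the expansion 

$$\hat{\phi} = \int\frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\,[\hat{a}(k)e^{-ik\cdot x} + \hat{b}^\dagger(k)e^{ik\cdot x}] \tag{7.16}$$

where 

$$\begin{aligned} \hat{a}(k) &= \frac{1}{\sqrt{2}}(\hat{a}_1 - i\hat{a}_2) \\ \hat{b}^\dagger(k) &= \frac{1}{\sqrt{2}}(\hat{a}_1^\dagger - i\hat{a}_2^\dagger) \end{aligned} \tag{7.17}$$

and $\omega = (M^2 + \mathbf{k}^2)^{1/2}$. The operators $\hat{a}$, $\hat{a}^\dagger$, $\hat{b}$, $\hat{b}^\dagger$ obey the commutation relations 

$$\begin{aligned} [\hat{a}(k), \hat{a}^\dagger(k')] &= (2\pi)^3\delta^3(\mathbf{k} - \mathbf{k}') \\ [\hat{b}(k), \hat{b}^\dagger(k')] &= (2\pi)^3\delta^3(\mathbf{k} - \mathbf{k}') \end{aligned} \tag{7.18}$$

with all others vanishing; this follows from the commutation relations 

$$[\hat{a}_i(k), \hat{a}_j^\dagger(k')] = \delta_{ij}(2\pi)^3\delta^3(\mathbf{k} - \mathbf{k}') \quad \text{etc.} \tag{7.19}$$

for the $\hat{a}_i$ operators. Note that two distinct mode operators, $\hat{a}$ and $\hat{b}$, are appearing in the 
expansion (7.16) of the complex field. 

In terms of this complex $\hat{\phi}$ the Lagrangian of (7.1) becomes 

$$\hat{\mathcal{L}} = \partial_\mu\hat{\phi}^\dagger\partial^\mu\hat{\phi} - M^2\hat{\phi}^\dagger\hat{\phi} \tag{7.20}$$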


and the Hamiltonian is (dropping the zero-point energy, i.e. normally ordering) 

$$\hat{H} = \int\frac{d^3k}{(2\pi)^3}\,[\hat{a}^\dagger(k)\hat{a}(k) + \hat{b}^\dagger(k)\hat{b}(k)]\,\omega. \tag{7.21}$$

The O(2) transformation (7.2) becomes a simple phase change 

$$\hat{\phi}' = e^{-i\alpha}\hat{\phi} \tag{7.22}$$

which (see comment (iii) of section 2.6) is called a global U(1) phase transformation; plainly 
the Lagrangian (7.20) is invariant under (7.22). The associated symmetry current $\hat{N}^\mu_\phi$ 
becomes 

$$\hat{N}^\mu_\phi = i(\hat{\phi}^\dagger\partial^\mu\hat{\phi} - \hat{\phi}\,\partial^\mu\hat{\phi}^\dagger) \tag{7.23}$$

and the symmetry operator $\hat{N}_\phi$ is (see problem 7.2) 

$$\hat{N}_\phi = \int\frac{d^3k}{(2\pi)^3}\,[\hat{a}^\dagger(k)\hat{a}(k) - \hat{b}^\dagger(k)\hat{b}(k)]. \tag{7.24}$$

Note that $\hat{N}_\phi$ has been normally ordered in anticipation of our later vacuum definition 
(7.30), so that $\hat{N}_\phi|0\rangle = 0$. 

We now observe that the Hamiltonian (7.21) involves the sum of the number operators 
for ‘a’ quanta and ‘b’ quanta, whereas $\hat{N}_\phi$ involves the difference of these number operators. 
Put differently, $\hat{N}_\phi$ counts +1 for each particle of type ‘a’ and −1 for each of type ‘b’. This 
strongly suggests the interpretation that the b’s are the anti-particles of the a’s: $\hat{N}_\phi$ is the 
conserved symmetry operator whose eigenvalues serve to distinguish them. For a general 
state, the eigenvalue of $\hat{N}_\phi$ is the number of a’s minus the number of anti-a’s and it is a 
constant of the motion, as is the total energy, which is the sum of the a energies and anti-a 
energies. 
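
The passage from the real pair of fields to the single complex field can be checked with ordinary numbers. In the sketch below the fields and their derivatives are treated as classical numbers at one space-time point, so that operator ordering plays no role; this simplification and the function names are our own. It checks that the rotation (7.2) acts on $\phi = (\phi_1 - i\phi_2)/\sqrt{2}$ as the phase change (7.22), and that the two forms (7.10) and (7.23) of the current agree.

```python
import cmath
import math


def complex_field(phi1, phi2):
    """Combine two real numbers as in (7.15)."""
    return (phi1 - 1j * phi2) / math.sqrt(2.0)


def rotate_o2(alpha, phi1, phi2):
    """The O(2) rotation (7.2)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return c * phi1 - s * phi2, s * phi1 + c * phi2


def current_real_form(phi1, dphi1, phi2, dphi2):
    """One component of the current written as in (7.10)."""
    return phi1 * dphi2 - phi2 * dphi1


def current_complex_form(phi, dphi):
    """The same component written as in (7.23)."""
    return 1j * (phi.conjugate() * dphi - phi * dphi.conjugate())


if __name__ == "__main__":
    alpha, phi1, phi2 = 0.9, 0.4, -1.2
    rotated = complex_field(*rotate_o2(alpha, phi1, phi2))
    phased = cmath.exp(-1j * alpha) * complex_field(phi1, phi2)
    print(abs(rotated - phased))  # difference between the two descriptions
```

The derivative of the complex field is formed with the same combination as the field itself, so `current_complex_form(complex_field(p1, p2), complex_field(d1, d2))` can be compared directly with `current_real_form(p1, d1, p2, d2)`.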

We have here the simplest form of the particle/anti-particle distinction: only one additive 
conserved quantity is involved. A more complicated example would be the ($K^+$, $K^-$) pair, 
which have opposite values of strangeness and of electric charge. Of course, in our simple 
Lagrangian (7.20) the electromagnetic interaction is absent, and so no electric charge can 
be defined (we shall remedy this later); the complex field $\hat{\phi}$ would be suitable (in respect of 
strangeness) for describing the ($K^0$, $\bar{K}^0$) pair. 

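The symmetry operator $\hat{N}_\phi$ has a number of further important properties. First of all, 
we have shown that $d\hat{N}_\phi/dt = 0$ from the general (Noether) argument, but we ought also 
to check that 

$$[\hat{N}_\phi, \hat{H}] = 0 \tag{7.25}$$

as is required for consistency, and expected for a symmetry operator. This is indeed true 
(see problem 7.2(a)). We can also show 

$$\begin{aligned} [\hat{N}_\phi, \hat{\phi}] &= -\hat{\phi} \\ [\hat{N}_\phi, \hat{\phi}^\dagger] &= \hat{\phi}^\dagger \end{aligned} \tag{7.26}$$

and, by expansion of the exponential (problem 7.2(b)), that 

$$\hat{U}(\alpha)\hat{\phi}\,\hat{U}^{-1}(\alpha) = e^{-i\alpha}\hat{\phi} = \hat{\phi}' \tag{7.27}$$

with 

$$\hat{U}(\alpha) = e^{i\alpha\hat{N}_\phi}. \tag{7.28}$$

This shows that the unitary operator $\hat{U}(\alpha)$ effects finite U(1) rotations. 

Consider now a state $|N_\phi\rangle$ which is an eigenstate of $\hat{N}_\phi$ with eigenvalue $N_\phi$. What is the 
eigenvalue of $\hat{N}_\phi$ for the state $\hat{\phi}|N_\phi\rangle$? It is easy to show, using (7.26) in the form 
$\hat{N}_\phi\hat{\phi} = \hat{\phi}\hat{N}_\phi + [\hat{N}_\phi, \hat{\phi}] = \hat{\phi}(\hat{N}_\phi - 1)$, that 

$$\hat{N}_\phi\hat{\phi}|N_\phi\rangle = (N_\phi - 1)\hat{\phi}|N_\phi\rangle \tag{7.29}$$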


so the application of $\hat{\phi}$ to a state lowers its $\hat{N}_\phi$ eigenvalue by 1. This is consistent with 
our interpretation that the $\hat{\phi}$ field destroys particles ‘a’ via the $\hat{a}$ piece in (7.16). (This ‘$\hat{\phi}$ 
destroys particles’ convention is the reason for choosing $\hat{\phi} = (\hat{\phi}_1 - i\hat{\phi}_2)/\sqrt{2}$ in (7.15), which 
in turn led to the minus sign in the relation (7.26) and to the earlier eigenvalue $N_\phi - 1$.) That 
$\hat{\phi}$ lowers the $\hat{N}_\phi$ eigenvalue by 1 is also consistent with the interpretation that the same field 
$\hat{\phi}$ creates an anti-particle via the $\hat{b}^\dagger$ piece in (7.16). In the same way, by considering $\hat{\phi}^\dagger|N_\phi\rangle$, 
one easily verifies that $\hat{\phi}^\dagger$ increases $N_\phi$ by 1, by creating a particle via $\hat{a}^\dagger$ or destroying an 
anti-particle via $\hat{b}$. The vacuum state (no particles and no anti-particles present) is defined 
by 

$$\hat{a}(k)|0\rangle = \hat{b}(k)|0\rangle = 0 \quad \text{for all } k. \tag{7.30}$$

As anticipated, therefore, the complex field $\hat{\phi}$ contains two distinct kinds of mode operator, 
one having to do with particles (with positive $N_\phi$), the other with anti-particles (negative 
$N_\phi$). Which we choose to call ‘particle’ and which ‘anti-particle’ is of course purely a matter 
of convention: after all, the negatively charged electron is always regarded as the ‘particle’, 
while in the case of the pions we call the positively charged $\pi^+$ the particle. 

FIGURE 7.1 
(a) For $t_1 > t_2$, a $\hat{\phi}$ particle ($N_\phi = 1$) propagates from $x_2$ to $x_1$; (b) for $t_2 > t_1$ an anti-$\hat{\phi}$ 
particle ($N_\phi = -1$) propagates from $x_1$ to $x_2$. 

Feynman rules for theories involving complex scalar fields may be derived by a straightforward 
extension of the procedure explained in chapter 6. It is, however, worth pausing 
over the propagator. The only non-vanishing vev of the time-ordered product of two $\hat{\phi}$ fields 
is $\langle 0|T(\hat{\phi}(x_1)\hat{\phi}^\dagger(x_2))|0\rangle$ (the vev’s of $T(\hat{\phi}\hat{\phi})$ and $T(\hat{\phi}^\dagger\hat{\phi}^\dagger)$ vanish with the vacuum defined 
as in (7.30)). In section 6.3.2 we gave a pictorial interpretation of the propagator for a real 
scalar field; let us now consider the analogous pictures for the complex field. For $t_1 > t_2$ the 
time-ordered product is $\hat{\phi}(x_1)\hat{\phi}^\dagger(x_2)$; using the expansion (7.16) and the vacuum conditions 
(7.30), the only surviving term in the vev is that in which an ‘$\hat{a}^\dagger$’ creates a particle ($N_\phi = 1$) 
at $(\mathbf{x}_2, t_2)$ and an ‘$\hat{a}$’ destroys it at $(\mathbf{x}_1, t_1)$; the ‘$\hat{b}$’ operators in $\hat{\phi}^\dagger(x_2)$ give zero when acting 
on $|0\rangle$, as do the ‘$\hat{b}^\dagger$’ operators in $\hat{\phi}(x_1)$ when acting on $\langle 0|$. Thus for $t_1 > t_2$ we have the 
pictorial interpretation of figure 7.1(a). For $t_2 > t_1$, however, the time-ordered product is 
$\hat{\phi}^\dagger(x_2)\hat{\phi}(x_1)$. Here the surviving vev comes from the ‘$\hat{b}^\dagger$’ in $\hat{\phi}(x_1)$ creating an anti-particle 
($N_\phi = -1$) at $x_1$, which is then annihilated by the ‘$\hat{b}$’ in $\hat{\phi}^\dagger(x_2)$. This $t_2 > t_1$ process 
is shown in figure 7.1(b). The inclusion of both processes shown in figure 7.1 makes sense 
physically, following considerations similar to those put forward ‘intuitively’ in section 3.5.4: 
the process of figure 7.1(a) creates (say) a positive unit of $N_\phi$ at $x_2$ and loses a positive 
unit at $x_1$, while another way of effecting the same ‘$N_\phi$ transfer’ is to create an anti-particle 
of unit negative $N_\phi$ at $x_1$, and propagate it to $x_2$ where it is destroyed, as in figure 7.1(b). 
It is important to be absolutely clear that the Feynman propagator $\langle 0|T(\hat{\phi}(x_1)\hat{\phi}^\dagger(x_2))|0\rangle$ 
includes both the processes in figures 7.1(a) and (b). 

In practice, as we found in section 6.3.2, we want the momentum-space version of 
the propagator, i.e. its Fourier transform. As we also noted there (cf also appendix G), 
the propagator is a Green function for the KG operator $(\Box + m^2)$ with mass parameter 
$m$; in momentum-space this is just the inverse, $(-k^2 + m^2)^{-1}$. In the present case, since 
both $\hat{\phi}$ and $\hat{\phi}^\dagger$ obey the same KG equation, with mass parameter $M$, we expect that the 
momentum-space version of $\langle 0|T(\hat{\phi}(x_1)\hat{\phi}^\dagger(x_2))|0\rangle$ is also 

$$\frac{i}{k^2 - M^2 + i\epsilon}. \tag{7.31}$$

This can be verified by inserting the expansion (7.16) into the vev of the T-product, and 
following the steps used in section 6.3.2 for the scalar case. 

FIGURE 7.2 
Equivalent Feynman graphs for single W-exchange in $\nu_e + e^- \to \nu_e + e^-$. 

In this (momentum-space) version, it is the ‘$i\epsilon$’ which keeps track of the ‘particles going 
from 2 to 1 if $t_1 > t_2$’ and ‘anti-particles going from 1 to 2 if $t_2 > t_1$’ (recall its appearance in 
the representation (6.93) of the all-important $\theta$-function). As in the scalar case, momentum-space 
propagators in Feynman diagrams carry no implied order of emission/absorption 
process; both the processes in figure 7.1 are always included in all propagators. Arrows 
showing ‘momentum flow’ now also show the flow of all conserved quantum numbers. Thus 
the process shown in figure 7.2(a) can equally well be represented as in figure 7.2(b). 

There is one more bit of physics to be gleaned from $\langle 0|T(\hat{\phi}(x_1)\hat{\phi}^\dagger(x_2))|0\rangle$. As in the real 
scalar field case, the vanishing of the commutator at space-like separations 

$$[\hat{\phi}(x_1), \hat{\phi}^\dagger(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{7.32}$$

guarantees the Lorentz invariance of the propagator for the complex scalar field and of the 
S-matrix. But in this (complex) case, there is a further twist to the story. Evaluation of 
$[\hat{\phi}(x_1), \hat{\phi}^\dagger(x_2)]$ reveals (problem 7.3) that, in the region $(x_1 - x_2)^2 < 0$, the commutator is the 
difference of two functions (not field operators), one of which arises from the propagation of 
a particle from $x_2$ to $x_1$, the other of which comes from the propagation of an anti-particle 
from $x_1$ to $x_2$ (just as in figure 7.1). Both processes must exist for this difference to be 
zero, and furthermore for cancellations between them to occur in the space-like region the 
masses of the particle and anti-particle must be identical. In quantum field theory, therefore, 
‘causality’ (in the sense of condition (7.32); cf (6.82)) requires that every particle has to 
have a corresponding anti-particle with the same mass and opposite quantum numbers. As 
we saw in chapter 4, these requirements are guaranteed by the CPT theorem, which is a 
consequence of very general principles of quantum field theory. 


7.2 The Dirac field and the spin-statistics connection 


I remember that when someone had tried to teach me about creation and annihilation 
operators, that this operator creates an electron, I said ‘how do you create an electron? 
It disagrees with the conservation of charge’, and in that way I blocked my mind from 
learning a very practical scheme of calculation. 


[From the lecture delivered by Richard Feynman in Stockholm, Sweden, on 11 December 
1965, when he received the Nobel Prize in physics, which he shared with Sin-itiro Tomonaga 
and Julian Schwinger. (Feynman 1966).] 


We now turn to the problem of setting up a quantum field which, in its wave aspects, sat- 
isfies the Dirac equation (cf comment (5) in section 5.2.5), and in its ‘particle’ aspects creates 
or annihilates fermions and anti-fermions. Following the ‘Heisenberg—Lagrange—Hamilton’ 
approach of section 5.2.5, we begin by writing down the Lagrangian which, via the corre- 
sponding Euler-Lagrange equation, produces the Dirac equation as the ‘field equation’. The 
answer (see problem 7.4) is 


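$$\hat{\mathcal{L}}_D = i\hat{\psi}^\dagger\dot{\hat{\psi}} + i\hat{\psi}^\dagger\boldsymbol{\alpha}\cdot\nabla\hat{\psi} - m\hat{\psi}^\dagger\beta\hat{\psi}. \tag{7.33}$$

The relativistic invariance of this is more evident in $\gamma$-matrix notation (problem 4.3): 

$$\hat{\mathcal{L}}_D = \bar{\hat{\psi}}(i\gamma^\mu\partial_\mu - m)\hat{\psi}. \tag{7.34}$$

We can now attempt to ‘quantize’ the field $\hat{\psi}$ by making a mode expansion in terms of 
plane-wave solutions of the Dirac equation, in a fashion similar to that for the complex 
scalar field in (7.16). We obtain (see problem 3.8 for the definition of the spinors $u$ and $v$, 
and the attendant normalization choice) 

$$\hat{\psi} = \int\frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\sum_{s=1,2}[\hat{c}_s(k)u(k,s)e^{-ik\cdot x} + \hat{d}_s^\dagger(k)v(k,s)e^{ik\cdot x}], \tag{7.35}$$

where $\omega = (m^2 + \mathbf{k}^2)^{1/2}$. We wish to interpret $\hat{c}_s^\dagger(k)$ as the creation operator for a Dirac 
particle of spin $s$ and momentum $k$. By analogy with (7.16), we expect that $\hat{d}_s^\dagger(k)$ creates 
the corresponding anti-particle. Presumably we must define the vacuum by (cf (7.30)) 

$$\hat{c}_s(k)|0\rangle = \hat{d}_s(k)|0\rangle = 0 \quad \text{for all } k \text{ and } s = 1,2. \tag{7.36}$$

A two-fermion state is then 

$$|k_1, s_1; k_2, s_2\rangle \propto \hat{c}_{s_1}^\dagger(k_1)\hat{c}_{s_2}^\dagger(k_2)|0\rangle. \tag{7.37}$$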


But it is here that there must be a difference from the boson case. We require a state 
containing two identical fermions to be antisymmetric under the exchange of state labels 
$k_1 \leftrightarrow k_2$, $s_1 \leftrightarrow s_2$, and thus to be forbidden if the two sets of quantum numbers are the same, 
in accordance with the Pauli exclusion principle, responsible for so many well-established 
features of the structure of matter. 

The solution to this dilemma is simple but radical: for fermions, commutation relations 
are replaced by anti-commutation relations! The anti-commutator of two operators $\hat{A}$ and 
$\hat{B}$ is written: 

$$\{\hat{A}, \hat{B}\} \equiv \hat{A}\hat{B} + \hat{B}\hat{A}. \tag{7.38}$$

If two different $\hat{c}$’s anti-commute, then 

$$\hat{c}_{s_1}^\dagger(k_1)\hat{c}_{s_2}^\dagger(k_2) + \hat{c}_{s_2}^\dagger(k_2)\hat{c}_{s_1}^\dagger(k_1) = 0 \tag{7.39}$$

so that we have the desired antisymmetry 

$$|k_1, s_1; k_2, s_2\rangle = -|k_2, s_2; k_1, s_1\rangle. \tag{7.40}$$

In general we postulate 

$$\begin{aligned} \{\hat{c}_{s_1}(k_1), \hat{c}_{s_2}^\dagger(k_2)\} &= (2\pi)^3\delta^3(\mathbf{k}_1 - \mathbf{k}_2)\delta_{s_1 s_2} \\ \{\hat{c}_{s_1}(k_1), \hat{c}_{s_2}(k_2)\} &= \{\hat{c}_{s_1}^\dagger(k_1), \hat{c}_{s_2}^\dagger(k_2)\} = 0 \end{aligned} \tag{7.41}$$

and similarly for the $\hat{d}$’s and $\hat{d}^\dagger$’s. The factor in front of the $\delta$-function depends on the 
convention for normalizing Dirac wavefunctions. 

We must at once emphasize that in taking this ‘replace commutators by anti-commutators’ 
step we now depart decisively from the intuitive, quasi-mechanical, picture 
of a quantum field given in chapter 5, namely as a system of quantized harmonic oscillators. 
Of course, the field expansion (7.35) is a linear superposition of ‘modes’ (plane-wave 
solutions), as for the complex scalar field in (7.16) for example; but the ‘mode operators’ 
$\hat{c}_s$ and $\hat{d}_s^\dagger$ are fermionic (obeying anti-commutation relations) not bosonic (obeying 
commutation relations). As mentioned at the end of section 5.1, it does not seem possible to 
provide any mechanical model of a system (in three dimensions) whose normal vibrations 
are fermionic. Correspondingly, there is no concept of a ‘classical electron field’, analogous 
to the classical electromagnetic field (which doubtless explains why we tend to think of 
fermions as basically ‘more particle-like’). However, we can certainly recover a quantum 
mechanical wavefunction from (7.35) by considering, as in comment (5) of section 5.4, the 
vacuum-to-one-particle matrix element $\langle 0|\hat{\psi}(\mathbf{x}, t)|k_1, s_1\rangle$. 
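
The algebra (7.41) and its consequence (7.40) can be illustrated in a finite-dimensional toy model. The sketch below is our own construction, not one used in the text: it keeps just two modes, labelled 1 and 2, replaces the continuum normalization $(2\pi)^3\delta^3$ by a Kronecker delta, and represents the mode operators by 4 × 4 matrices built with the standard Jordan-Wigner sign factor. All names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def dagger(a):
    """Hermitian conjugate; all entries here are real, so this is the transpose."""
    return [list(col) for col in zip(*a)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]


def apply(a, vec):
    """Act with a matrix on a state vector."""
    return [sum(a[i][j] * vec[j] for j in range(len(vec))) for i in range(len(a))]


# Single-mode building blocks in the basis (empty, occupied).
LOWER = [[0, 1], [0, 0]]   # destroys the quantum if present
SIGN = [[1, 0], [0, -1]]   # Jordan-Wigner sign: -1 if the mode is occupied
UNIT = [[1, 0], [0, 1]]

C1 = kron(LOWER, UNIT)     # destruction operator for mode 1
C2 = kron(SIGN, LOWER)     # destruction operator for mode 2
VACUUM = [1, 0, 0, 0]      # both modes empty


def two_fermion_state(first, second):
    """Return first^dagger second^dagger |0>."""
    return apply(dagger(first), apply(dagger(second), VACUUM))


if __name__ == "__main__":
    print(two_fermion_state(C1, C2), two_fermion_state(C2, C1))
    print(two_fermion_state(C1, C1))
```

Exchanging the two creation operators reverses the sign of the two-particle state, and creating the same mode twice gives the zero vector, which is the Pauli principle in this toy setting.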

In the bosonic case, we arrived at the commutation relations (5.130) for the mode 
operators by postulating the ‘fundamental commutator of quantum field theory’, equation 
(5.117), which was an extension to fields of the canonical commutation relations of quantum 
(particle) mechanics. For fermions, we have simply introduced the anti-commutation 
relations (7.41) ‘by hand’, so as to satisfy the Pauli principle. We may ask: What then 
becomes of the analogous ‘fundamental commutator’ in the fermionic case? A plausible guess 
is that, as with the mode operators, the ‘fundamental commutator’ is to be replaced by a 
‘fundamental anti-commutator’, between the fermionic field $\hat{\psi}$ and its ‘canonically conjugate 
momentum field’ $\hat{\pi}_D$, of the form: 

$$\{\hat{\psi}(\mathbf{x}, t), \hat{\pi}_D(\mathbf{y}, t)\} = i\delta^3(\mathbf{x} - \mathbf{y}). \tag{7.42}$$

As far as $\hat{\pi}_D$ is concerned, we may suppose that its definition is formally analogous to 
(5.122), which would yield 

$$\hat{\pi}_D = \frac{\partial\hat{\mathcal{L}}_D}{\partial\dot{\hat{\psi}}} = i\hat{\psi}^\dagger. \tag{7.43}$$

We must also not forget that both $\hat{\psi}$ and $\hat{\pi}_D$ are four-component objects, carrying spinor 
indices. Thus we are led to expect the result 

$$\{\hat{\psi}_\alpha(\mathbf{x}, t), \hat{\psi}_\beta^\dagger(\mathbf{y}, t)\} = \delta^3(\mathbf{x} - \mathbf{y})\delta_{\alpha\beta}, \tag{7.44}$$

where $\alpha$ and $\beta$ are spinor indices. It is a good exercise to check, using (7.41), that this is 
indeed the case (problem 7.5). We also find 

$$\{\hat{\psi}(\mathbf{x}, t), \hat{\psi}(\mathbf{y}, t)\} = \{\hat{\psi}^\dagger(\mathbf{x}, t), \hat{\psi}^\dagger(\mathbf{y}, t)\} = 0. \tag{7.45}$$

In this (anti-commutator) sense, then, we have a ‘canonical’ formalism for fermions. 
The Dirac Hamiltonian density is then (cf (5.123)) 

$$\hat{\mathcal{H}}_D = \hat{\pi}_D\dot{\hat{\psi}} - \hat{\mathcal{L}}_D = \hat{\psi}^\dagger\boldsymbol{\alpha}\cdot(-i\nabla)\hat{\psi} + m\hat{\psi}^\dagger\beta\hat{\psi} \tag{7.46}$$

using (7.43) and (7.33), and the Hamiltonian is 

$$\hat{H}_D = \int[\hat{\psi}^\dagger\boldsymbol{\alpha}\cdot(-i\nabla)\hat{\psi} + m\hat{\psi}^\dagger\beta\hat{\psi}]\, d^3x. \tag{7.47}$$


One may well wonder why things have to be this way—‘bosons commute, fermions anti- 
commute’. To gain further insight, we turn again to a consideration of symmetries and the 
question of particle and anti-particle—this time for the Dirac field, rather than the Dirac 
wavefunction discussed in chapter 4. 

The Dirac field $\hat{\psi}$ is a complex field, as is reflected in the two distinct mode operators in 
the expansion (7.35); as in the complex scalar field case, there is only one mass parameter 
and we expect the quanta to be interpretable as particle and anti-particle. The symmetry 
operator which distinguishes them is found by analogy with the complex scalar field 
case. We note that $\hat{\mathcal{L}}_D$ (the quantized version of (7.34)) is invariant under the global U(1) 
transformation 

$$\hat{\psi} \to \hat{\psi}' = e^{-i\alpha}\hat{\psi} \tag{7.48}$$

which is 

$$\hat{\psi} \to \hat{\psi}' = \hat{\psi} - i\epsilon\hat{\psi} \tag{7.49}$$

in infinitesimal form. The corresponding (Noether) symmetry current can be calculated as 

$$\hat{N}^\mu_\psi = \bar{\hat{\psi}}\gamma^\mu\hat{\psi} \tag{7.50}$$

and the associated symmetry operator is 

$$\hat{N}_\psi = \int\hat{\psi}^\dagger\hat{\psi}\, d^3x. \tag{7.51}$$

$\hat{N}_\psi$ is clearly a number operator for the fermion case. As for the complex scalar field, 
invariance under a global U(1) phase transformation is associated with a number conservation 
law. 

Inserting the plane-wave expansion (7.35), we obtain, after some effort (problem 7.6), 

$$\hat{N}_\psi = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}[\hat{c}_s^\dagger(k)\hat{c}_s(k) + \hat{d}_s(k)\hat{d}_s^\dagger(k)]. \tag{7.52}$$

Similarly, the Dirac Hamiltonian may be shown to have the form (problem 7.6) 

$$\hat{H}_D = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}[\hat{c}_s^\dagger(k)\hat{c}_s(k) - \hat{d}_s(k)\hat{d}_s^\dagger(k)]\,\omega. \tag{7.53}$$


It is important to state that in obtaining (7.52) and (7.53), we have not assumed either 
commutation or anti-commutation relations for the mode operators $\hat{c}$, $\hat{c}^\dagger$, $\hat{d}$, and $\hat{d}^\dagger$, only 
properties of the Dirac spinors; in particular, neither (7.52) nor (7.53) has been normally 
ordered. Suppose now that we assume commutation relations, so as to rewrite the last terms 
in (7.52) and (7.53) in normally ordered form as $\hat{d}_s^\dagger(k)\hat{d}_s(k)$. We see that $\hat{H}_D$ will then 
contain the difference of two number operators for ‘c’ and ‘d’ particles, and is therefore not 
positive-definite as we require for a sensible theory. Moreover, we suspect that, as in the $\hat{\phi}$ 
case, the ‘d’s’ ought to be the anti-particles of the ‘c’s’, carrying opposite $\hat{N}_\psi$ value: but $\hat{N}_\psi$ 
is then (with the previous assumption about commutation relations) just proportional to 
the sum of ‘c’ and ‘d’ number operators, counting +1 for each type, which does not fit this 
interpretation. However, if anti-commutation relations are assumed, both these problems 
disappear: dropping the usual infinite terms, we obtain the normally ordered forms 

$$\hat{N}_\psi = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}[\hat{c}_s^\dagger(k)\hat{c}_s(k) - \hat{d}_s^\dagger(k)\hat{d}_s(k)] \tag{7.54}$$

$$\hat{H}_D = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}[\hat{c}_s^\dagger(k)\hat{c}_s(k) + \hat{d}_s^\dagger(k)\hat{d}_s(k)]\,\omega \tag{7.55}$$

which are satisfactory, and allow us to interpret the ‘d’ quanta as the anti-particles of the 
‘c’ quanta. Similar difficulties would have occurred in the complex scalar field case if we had 
assumed anti-commutation relations for the boson operators, and the ‘causality’ discussion 
at the end of the preceding section would not have worked either (instead of a difference of 
terms we would have had a sum). It is in this way that quantum field theory enforces the 
connection between spin and statistics. 

Our discussion here is only a part of a more general approach leading to the same 
conclusion, first given by Pauli (1940); see also Streater et al. (1964). 

As in the complex scalar case, the other crucial ingredient we need is the Dirac propagator 
$\langle 0|T(\hat{\psi}(x_1)\bar{\hat{\psi}}(x_2))|0\rangle$. We shall see in section 7.4 why it is $\bar{\hat{\psi}}$ here rather than $\hat{\psi}^\dagger$; the 
reason is essentially to do with Lorentz covariance (see section 4.1.2). Because the $\hat{\psi}$ fields 
are anti-commuting, the T-symbol now has to be understood as 

$$T(\hat{\psi}(x_1)\bar{\hat{\psi}}(x_2)) = \hat{\psi}(x_1)\bar{\hat{\psi}}(x_2) \quad \text{for } t_1 > t_2 \tag{7.56}$$

$$T(\hat{\psi}(x_1)\bar{\hat{\psi}}(x_2)) = -\bar{\hat{\psi}}(x_2)\hat{\psi}(x_1) \quad \text{for } t_2 > t_1. \tag{7.57}$$

Once again, this propagator is proportional to a Green function, this time for the Dirac 
equation, of course. Using $\gamma$-matrix notation (problem 4.3) the Dirac equation is (cf (7.34)) 

$$(i\gamma^\mu\partial_\mu - m)\hat{\psi} = 0. \tag{7.58}$$


The momentum-space version of the propagator is proportional to the inverse of the oper- 
ator in (7.58), when written in k-space, namely to (K — m)~1 where 


k= yky (7.59) 


is an important shorthand notation (pronounced ‘k-slash’). In fact, the Feynman propagator
for Dirac fields is

$$\frac{i}{\not{k} - m + i\epsilon}. \tag{7.60}$$

As in (7.31), the $i\epsilon$ takes care of the particle/anti-particle, emission/absorption business.
Formula (7.60) is the fermion analogue of ‘rule (ii)’ in (6.103).

The reader should note carefully one very important difference between (7.60) and (7.31),
which is that (7.60) is a 4×4 matrix. What we are really saying (cf (6.98)) is that the Fourier
transform of $\langle 0|T(\hat\psi_\alpha(x_1)\hat{\bar\psi}_\beta(x_2))|0\rangle$, where α and β run over the four components of the
Dirac field, is equal to the (α, β) matrix element of the matrix $i(\not{k} - m + i\epsilon)^{-1}$:

$$\int d^4(x_1 - x_2)\, e^{ik\cdot(x_1 - x_2)} \langle 0|T(\hat\psi_\alpha(x_1)\hat{\bar\psi}_\beta(x_2))|0\rangle = i(\not{k} - m + i\epsilon)^{-1}_{\alpha\beta}. \tag{7.61}$$

The form (7.61) can be made to look more like (7.31) by making use of the result (problem 7.7)

$$(\not{k} - m)(\not{k} + m) = (k^2 - m^2) \tag{7.62}$$

(where the 4×4 unit matrix is understood on the right-hand side) so as to write (7.61) as

$$\frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}. \tag{7.63}$$


As in the scalar case, (7.61) can be directly verified by inserting the field expansion (7.35)
into the left-hand side, and following steps analogous to those in equations (6.92)–(6.98). In
following this through one will meet the expressions $\sum_s u(k,s)\bar{u}(k,s)$ and $\sum_s v(k,s)\bar{v}(k,s)$,
which are also 4 × 4 matrices. Problem 7.8 shows that these quantities are given by


$$\sum_s u_\alpha(k,s)\bar{u}_\beta(k,s) = (\not{k} + m)_{\alpha\beta}, \qquad \sum_s v_\alpha(k,s)\bar{v}_\beta(k,s) = (\not{k} - m)_{\alpha\beta}. \tag{7.64}$$


With these results, and remembering the minus sign in (7.57), one can check (7.63) (prob- 
lem 7.9). 
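
The matrix identity (7.62), on which the form (7.63) rests, can be spot-checked numerically. The following is only a minimal sketch, assuming the Dirac representation of the γ-matrices and the metric signature (+,−,−,−); the helper names (`slash`, `matmul`, and so on) and the sample momentum are illustrative choices, not part of the text.

```python
# Pure-Python check of (kslash - m)(kslash + m) = (k^2 - m^2) * identity, eq. (7.62).
# Assumptions: Dirac representation of the gamma matrices, metric (+,-,-,-).

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, scale=1):
    """Return a + scale * b, entry by entry."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n, value=1):
    """Return value times the n x n unit matrix."""
    return [[value if i == j else 0 for j in range(n)] for i in range(n)]

def block_gamma(sigma):
    """gamma^i = [[0, sigma_i], [-sigma_i, 0]] in the Dirac representation."""
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [block_gamma(s) for s in SIGMA]
METRIC = [1, -1, -1, -1]

def slash(k):
    """kslash = gamma^mu k_mu, for contravariant components k^mu."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = add(out, GAMMA[mu], METRIC[mu] * k[mu])
    return out

def minkowski_square(k):
    """k^2 = (k^0)^2 - |k|^2."""
    return sum(METRIC[mu] * k[mu] ** 2 for mu in range(4))

def max_abs_diff(a, b):
    """Largest entry-wise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

m = 0.7
k = [1.3, 0.2, -0.5, 0.9]          # an arbitrary off-shell 4-momentum
ks = slash(k)
lhs = matmul(add(ks, identity(4), -m), add(ks, identity(4), m))
rhs = identity(4, minkowski_square(k) - m ** 2)
print(max_abs_diff(lhs, rhs))       # rounding error only
```

Since the right-hand side is a multiple of the unit matrix, dividing $i(\not{k} + m)$ by $k^2 - m^2$ does give $i$ times the inverse of $(\not{k} - m)$ whenever $k^2 \neq m^2$, which is the content of (7.63) apart from the $i\epsilon$.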

One might now worry that the adoption of anti-commutation relations for Dirac fields 
might spoil ‘causality’, in the sense of the discussion after (7.32). One finds, indeed, that 
the fields $\hat\psi$ and $\hat{\bar\psi}$ anti-commute at space-like separation, but this is enough to preserve
causality for physical observables, which will involve an even number of fermionic fields. 

We now turn to the problem of quantizing the Maxwell (electromagnetic) field. 


7.3 The Maxwell field $A^\mu(x)$
7.3.1 The classical field case 


Following the now familiar procedure, our first task is to find the classical field Lagrangian
which, via the corresponding Euler–Lagrange equations, yields the Maxwell equation for
the electromagnetic potential $A^\mu$, namely (cf (2.22))

$$\Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = j^\mu_{\rm em}. \tag{7.65}$$

The answer is (see problem 7.10)

$$\mathcal{L}_{\rm em} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\mu_{\rm em}A_\mu \tag{7.66}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. So the pure A-field part is the Maxwell Lagrangian

$$\mathcal{L}_A = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}. \tag{7.67}$$

Before proceeding to try to quantize (7.67), we need to understand some important aspects
of the free classical field $A^\mu(x)$.

When $j^\mu_{\rm em}$ is set equal to zero, $A^\mu$ satisfies the equation

$$\partial_\mu F^{\mu\nu} = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = 0. \tag{7.68}$$


As we have seen in section 2.3, these equations are left unchanged if we perform the gauge 
transformation 


$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{7.69}$$

We can use this freedom to choose the $A^\mu$ with which we work to satisfy the condition


$$\partial_\mu A^\mu = 0. \tag{7.70}$$


This is called the Lorentz condition.¹ The process of choosing a particular condition on $A^\mu$
so as to define it (ultimately) uniquely is called ‘choosing a gauge’; actually the condition
(7.70) does not yet define $A^\mu$ uniquely, as we shall see shortly. The Lorentz condition
is a very convenient one, since it decouples the different components of $A^\mu$ in Maxwell’s
equations (7.68), and does so in a covariant way, leaving the very simple equation

$$\Box A^\mu = 0. \tag{7.71}$$

This has plane-wave solutions of the form

$$A^\mu = N\epsilon^\mu e^{-ik\cdot x} \tag{7.72}$$

with $k^2 = 0$ (i.e. $k_0^2 = \boldsymbol{k}^2$), where N is a normalization factor and $\epsilon^\mu$ is a polarization vector
for the wave. The gauge condition (7.70) now reduces to a condition on $\epsilon^\mu$:


$$k\cdot\epsilon = 0. \tag{7.73}$$


However, we have not yet exhausted all the gauge freedom. We are still free to make another 
shift in the potential 
$$A^\mu \to A^\mu - \partial^\mu\tilde\chi \tag{7.74}$$

provided $\tilde\chi$ satisfies the massless KG equation

$$\Box\tilde\chi = 0. \tag{7.75}$$

This condition on $\tilde\chi$ ensures that, even after the further shift, the resulting potential still
satisfies $\partial_\mu A^\mu = 0$. For our plane-wave solutions, this residual gauge freedom corresponds
to changing $\epsilon^\mu$ by a multiple of $k^\mu$:

$$\epsilon^\mu \to \epsilon^\mu + \beta k^\mu \equiv \epsilon'^\mu \tag{7.76}$$

which still satisfies $\epsilon'\cdot k = 0$ since $k^2 = 0$ for these free-field solutions. The condition $k^2 = 0$
is, of course, the statement that a free photon is massless. 
This freedom has important consequences. Consider a solution with 


$$k^\mu = (k^0, \boldsymbol{k}), \qquad (k^0)^2 = \boldsymbol{k}^2 \tag{7.77}$$

and polarization vector

$$\epsilon = (\epsilon^0, \boldsymbol{\epsilon}) \tag{7.78}$$

satisfying the Lorentz condition

$$k\cdot\epsilon = 0. \tag{7.79}$$

Gauge invariance now implies that we can add multiples of $k^\mu$ to $\epsilon^\mu$ and still have a
satisfactory polarization vector.

It is therefore clear that we can arrange for the time component of $\epsilon^\mu$ to vanish so that
the Lorentz condition reduces to the 3-vector condition


$$\boldsymbol{k}\cdot\boldsymbol{\epsilon} = 0. \tag{7.80}$$


¹ This is the common, but incorrect, name. The condition was first introduced by Ludwig Lorenz (Lorenz
1867), and only later given by H. A. Lorentz (Lorentz 1892).

This means that there are only two independent polarization vectors, both transverse to $\boldsymbol{k}$,
i.e. to the propagation direction. For a wave travelling in the z-direction ($k^\mu = (k^0, 0, 0, k^0)$),
these may be chosen to be

$$\boldsymbol{\epsilon}_{(1)} = (1, 0, 0) \tag{7.81}$$
$$\boldsymbol{\epsilon}_{(2)} = (0, 1, 0). \tag{7.82}$$

Such a choice corresponds to linear polarization of the associated $\boldsymbol{E}$ and $\boldsymbol{B}$ fields, which
can be easily calculated from (2.10) and (2.11), given


$$A^\mu_{(i)} = N(0, \boldsymbol{\epsilon}_{(i)})\,e^{-ik\cdot x}, \qquad i = 1, 2. \tag{7.83}$$


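A commonly used alternative choice is

$$\boldsymbol{\epsilon}(\lambda = +1) = -\frac{1}{\sqrt{2}}(1, i, 0) \tag{7.84}$$
$$\boldsymbol{\epsilon}(\lambda = -1) = \frac{1}{\sqrt{2}}(1, -i, 0) \tag{7.85}$$

(linear combinations of (7.81) and (7.82)), which correspond to circularly polarized radiation.
The phase convention in (7.84) and (7.85) is the standard one in quantum mechanics
for states of definite spin projection (‘helicity’) λ = ±1 along the direction of motion (the
z-axis here). We may easily check that

$$\boldsymbol{\epsilon}^*(\lambda)\cdot\boldsymbol{\epsilon}(\lambda') = \delta_{\lambda\lambda'} \tag{7.86}$$

or, in terms of the corresponding 4-vectors $\epsilon^\mu = (0, \boldsymbol{\epsilon})$

$$\epsilon^*(\lambda)\cdot\epsilon(\lambda') = -\delta_{\lambda\lambda'}. \tag{7.87}$$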


We have therefore arrived at the result, familiar in classical electromagnetic theory, that 
the free electromagnetic fields are purely transverse. Though they are described in this 
formalism by a vector potential with apparently four independent components (V, A), the 
condition (7.70) reduces this number by one, and the further gauge freedom exploited in 
(7.74)-(7.76) reduces it by one more. 
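
The counting just described can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the sample momentum and polarization are arbitrary choices, and the function names are hypothetical. It first uses the shift (7.76) to remove the time component of a polarization vector obeying $k\cdot\epsilon = 0$, and then checks the orthonormality (7.86) of the circular polarization vectors (7.84) and (7.85).

```python
import math

def dot4(a, b):
    """Minkowski product a.b with metric (+,-,-,-); no complex conjugation."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def remove_time_component(eps, k):
    """Shift eps -> eps + beta*k, eq. (7.76), with beta chosen so that eps'^0 = 0."""
    beta = -eps[0] / k[0]
    return [e + beta * q for e, q in zip(eps, k)]

def hermitian_dot(a, b):
    """3-vector product a* . b, as in eq. (7.86)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Massless momentum along z, and a polarization with k.eps = 0 but eps^0 != 0.
k = [2.0, 0.0, 0.0, 2.0]
eps = [0.5, 1.0, 0.0, 0.5]
assert abs(dot4(k, eps)) < 1e-12

eps_prime = remove_time_component(eps, k)
print(eps_prime)                      # time and z components are both gone
print(dot4(k, eps_prime))             # the Lorentz condition still holds

# Circular polarization 3-vectors of (7.84) and (7.85).
s = 1 / math.sqrt(2)
eps_plus = [-s, -s * 1j, 0]
eps_minus = [s, -s * 1j, 0]
print(abs(hermitian_dot(eps_plus, eps_plus)),
      abs(hermitian_dot(eps_plus, eps_minus)))
```

Because $k^\mu$ here has equal time and z components, removing $\epsilon^0$ removes the longitudinal component at the same stroke, leaving a purely transverse vector.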

A crucial point to note is that the reduction to only two independent field components 
(polarization states) can be traced back to the fact that the free photon is massless: see the 
remark after (7.76). By contrast, for massive spin-1 bosons, such as the $W^\pm$ and $Z^0$, all three
expected polarization states are indeed present. However, weak interactions are described
by a gauge theory, and the $W^\pm$ and $Z^0$ particles are gauge-field quanta, analogous to the
photon. How gauge invariance can be reconciled with the existence of massive gauge quanta 
with three polarization states will be explained in volume 2. 

We may therefore write the plane-wave mode expansion for the classical $A^\mu(x)$ field in
the form

$$A^\mu(x) = \sum_\lambda \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[\epsilon^\mu(k,\lambda)\,\alpha(k,\lambda)\,e^{-ik\cdot x} + \epsilon^{\mu *}(k,\lambda)\,\alpha^*(k,\lambda)\,e^{ik\cdot x}\right] \tag{7.88}$$

where the sum is over the two possible polarization states λ, for given $\boldsymbol{k}$, as described by
the suitable polarization vector $\epsilon^\mu(k,\lambda)$, and $\omega = |\boldsymbol{k}|$.

It would seem that all we have to do now, in order to ‘quantize’ (7.88), is to promote α
and α* to operators $\hat\alpha$ and $\hat\alpha^\dagger$, as usual. However, things are actually not nearly so simple.

7.3.2 Quantizing $A^\mu(x)$


Readers familiar with Lagrangian mechanics may already suspect that quantizing $A^\mu$ is
not going to be straightforward. The problem is that, clearly, $A^\mu(x)$ has four (Lorentz)
components—but, equally clearly in view of the previous section, they are not all indepen- 
dent field components or field degrees of freedom. In fact, there are only two independent 
degrees of freedom, both transverse. Thus there are constraints on the four fields, for in- 
stance the gauge condition (7.70). Constrained systems are often awkward to handle in 
classical mechanics (see for example Goldstein 1980) or classical field theory; and they 
present major problems when it comes to canonical quantization. It is actually at just this 
point that the ‘path-integral’ approach to quantization, alluded to briefly at the end of sec- 
tion 5.2.2, comes into its own. This is basically because it does not involve non-commuting 
(or anti-commuting) operators and it is therefore to that extent closer to the classical case. 
This means that the relatively straightforward procedures available for constrained classical 
mechanics systems can (when suitably generalized!) be efficiently brought to bear on the
quantum problem. For an introduction to these ideas, we refer to Swanson (1992). 

However, we do not wish at this stage to take what would be a very long detour, in 
setting up the path-integral quantization of QED. We shall continue along the ‘canonical’ 
route. To see the kind of problems we encounter, let us try and repeat for the $A^\mu$ field the
‘canonical’ procedure we introduced in section 5.2.5. This was based, crucially, on obtaining
from the Lagrangian the momentum π conjugate to φ, and then imposing the commutation
relation (5.117) on the corresponding operators $\hat\pi$ and $\hat\phi$. But inspection of our Maxwell
Lagrangian (7.67) quickly reveals that

$$\frac{\partial\mathcal{L}_A}{\partial\dot{A}^0} = 0 \tag{7.89}$$

and hence there is no canonical momentum $\pi^0$ conjugate to $A^0$. We appear to be stymied
before we can even start.

There is another problem as well. Following the procedure explained in chapter 6, we
expect that the Feynman propagator for the $A^\mu$ field, namely $\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$, will
surely appear, describing the propagation of a photon between $x_1$ and $x_2$. In the case of
real scalar fields, problem 6.3 showed that the analogous quantity was actually a Green
function for the KG differential operator, $(\Box + m^2)$. It turned out, in that case, that what
we really wanted was the Fourier transform of the Green function, which was essentially
(apart from the tricky ‘$i\epsilon$ prescription’ and a trivial $-i$ factor) the inverse of the momentum-space
operator corresponding to $(\Box + m^2)$, namely $(-k^2 + m^2)^{-1}$ (see equation (6.98) and
appendix G, and also (7.58)–(7.60) for the Dirac case). Suppose, then, that we try to follow
this route to obtaining the propagator for the $A^\mu$ field. For this it is sufficient to consider
the classical equations (7.68) with $j^\mu_{\rm em} = 0$, written in k space (problem 7.11(a)):

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)\tilde{A}_\mu(k) \equiv M^{\nu\mu}\tilde{A}_\mu(k) = 0 \tag{7.90}$$

where $\tilde{A}_\mu(k)$ is the Fourier transform of $A_\mu(x)$. We therefore require the inverse

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)^{-1} \equiv (M^{-1})^{\nu\mu}. \tag{7.91}$$

Unfortunately it is easy to show that this inverse does not exist. From Lorentz covariance,
it has to transform as a second-rank tensor, and the only ones available are $g^{\nu\mu}$ and $k^\nu k^\mu$.
So the general form of $(M^{-1})^{\nu\mu}$ must be

$$(M^{-1})^{\nu\mu} = A(k^2)g^{\nu\mu} + B(k^2)k^\nu k^\mu. \tag{7.92}$$

Now the inverse is defined by

$$(M^{-1})^{\nu\mu}M_{\mu\sigma} = g^\nu_{\ \sigma}. \tag{7.93}$$

Putting (7.92) and (7.90) into (7.93) yields (problem 7.11(b))

$$-k^2 A(k^2)g^\nu_{\ \sigma} + A(k^2)k^\nu k_\sigma = g^\nu_{\ \sigma} \tag{7.94}$$

which cannot be satisfied. So we are thwarted again.
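
The non-existence of the inverse can also be seen directly: the matrix in (7.90) annihilates the vector $k_\mu$ itself, whatever the value of $k^2$, and a matrix with a non-trivial null vector is singular. The following sketch checks this numerically; it assumes the signature (+,−,−,−), and the function names and sample momentum are purely illustrative.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]        # diagonal of g, signature (+,-,-,-)

def maxwell_operator(k):
    """M^{nu mu} = -k^2 g^{nu mu} + k^nu k^mu of eq. (7.90), for contravariant k^mu."""
    k2 = sum(METRIC[a] * k[a] ** 2 for a in range(4))
    return [[-k2 * (METRIC[n] if n == m else 0.0) + k[n] * k[m]
             for m in range(4)] for n in range(4)]

def act_on_lower_vector(M, v_lower):
    """Contract M^{nu mu} with a covariant vector v_mu."""
    return [sum(M[n][m] * v_lower[m] for m in range(4)) for n in range(4)]

k = [1.3, 0.2, -0.5, 0.9]               # arbitrary off-shell momentum
k_lower = [METRIC[a] * k[a] for a in range(4)]
null_image = act_on_lower_vector(maxwell_operator(k), k_lower)
print(null_image)                       # every entry vanishes up to rounding
```

The null direction $\tilde{A}_\mu \propto k_\mu$ is just the momentum-space form of a pure gauge shift $\partial_\mu\chi$, so the obstruction is a direct consequence of the gauge invariance of (7.68).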
Nothing daunted, the attentive reader may have an answer ready for the propagator 
problem. Suppose that, instead of (7.68), we start from the much simpler equation 


$$\Box A^\mu = 0 \tag{7.95}$$

which results from imposing the Lorentz condition (7.70). Then, in momentum-space, (7.95)
becomes

$$-k^2\tilde{A}^\mu = 0. \tag{7.96}$$

The ‘$-k^2$’ on the left-hand side certainly has an inverse, implying that the Feynman propagator
for the photon is (proportional to) $g_{\mu\nu}/k^2$. This form is indeed plausible, as it is
very much what we would expect by taking the massless limit of the spin-0 propagator and
tacking on $g_{\mu\nu}$ to account for the Lorentz indices in $\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle$ (but then why
no term in $k_\mu k_\nu$? See the final two paragraphs of this section!).

Perhaps this approach helps with the ‘no canonical momentum $\pi^0$’ problem too. Let us
ask: What Lagrangian leads to the field equation (7.95)? The answer is (problem 7.12)

$$\mathcal{L}_L = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \tfrac{1}{2}(\partial_\mu A^\mu)^2. \tag{7.97}$$

This form does seem to offer better prospects for quantization, since at least all our $\pi^\mu$’s
are non-zero; in particular

$$\pi^0 = \frac{\partial\mathcal{L}_L}{\partial\dot{A}_0} = -\partial_\mu A^\mu. \tag{7.98}$$

The other π’s are unchanged by the addition of the extra term in (7.97) and are given by

$$\pi^i = -\dot{A}^i + \partial^i A^0. \tag{7.99}$$

Interestingly, these are precisely the electric fields $E^i$ (see (2.10)). Let us see, then, if all
our problems are solved with $\mathcal{L}_L$.

Now that we have at least got four non-zero $\pi^\mu$’s, we can write down a plausible set of
commutation relations between the corresponding operator quantities $\hat\pi^\mu$ and $\hat{A}^\nu$:

$$[\hat{A}_\mu(\boldsymbol{x},t), \hat\pi_\nu(\boldsymbol{y},t)] = i g_{\mu\nu}\delta^3(\boldsymbol{x} - \boldsymbol{y}). \tag{7.100}$$

Again, the $g_{\mu\nu}$ is there to give the same Lorentz transformation character on both sides of
the equation. But we must now remember that, in the classical case, our development rested
on imposing the condition $\partial_\mu A^\mu = 0$ (7.70). Can we, in the quantum version we are trying
to construct, simply impose $\partial_\mu\hat{A}^\mu = 0$? We certainly cannot do so in $\hat{\mathcal{L}}_L$, or we are back to
$\hat{\mathcal{L}}_A$ again (besides, constraints cannot be ‘substituted back’ into Lagrangians, in general).
Furthermore, if we set μ = ν = 0 in (7.100), then the right-hand side is non-zero while the
left-hand side is zero if $\partial_\mu\hat{A}^\mu = 0 = \hat\pi^0$. So it is inconsistent simply to set $\partial_\mu\hat{A}^\mu = 0$.

We will return to the treatment of ‘$\partial_\mu\hat{A}^\mu = 0$’ eventually. First, let us press on with
(7.97) and see if we can get as far as a (quantized) mode expansion, of the form (7.88), for
$\hat{A}^\mu(x)$.

To set this up, we need to massage the commutator (7.100) into a form as close as
possible to the canonical ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ form. Assuming the other commutation relations (cf
(5.118))

$$[\hat{A}_\mu(\boldsymbol{x},t), \hat{A}_\nu(\boldsymbol{y},t)] = [\hat\pi_\mu(\boldsymbol{x},t), \hat\pi_\nu(\boldsymbol{y},t)] = 0 \tag{7.101}$$

we see that the spatial derivatives of the $\hat{A}$’s commute with the $\hat{A}$’s, and with each other,
at equal times. This implies that we can rewrite the (quantum) $\hat\pi$’s as

$$\hat\pi_\mu = -\dot{\hat{A}}_\mu + \text{pieces that commute}. \tag{7.102}$$

Hence (7.100) can be rewritten as

$$[\hat{A}_\mu(\boldsymbol{x},t), \dot{\hat{A}}_\nu(\boldsymbol{y},t)] = -i g_{\mu\nu}\delta^3(\boldsymbol{x} - \boldsymbol{y}) \tag{7.103}$$

and (7.101) remains the same. Now (7.103) is indeed very much the same as ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ for
the spatial components $\hat{A}^i$, but the sign is wrong in the μ = ν = 0 case. We are not out of
the maze yet.

Nevertheless, proceeding onwards on the basis of (7.103), we write the quantum mode
expansion as (cf (7.88))

$$\hat{A}^\mu(x) = \sum_{\lambda=0}^{3} \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[\epsilon^\mu(k,\lambda)\,\hat\alpha_\lambda(k)\,e^{-ik\cdot x} + \epsilon^{*\mu}(k,\lambda)\,\hat\alpha^\dagger_\lambda(k)\,e^{ik\cdot x}\right] \tag{7.104}$$

where the sum is over four independent polarization states λ = 0, 1, 2, 3, since all four fields
are still in play. Before continuing, we need to say more about these ε’s (previously, we
only had two of them, now we have four and they are 4-vectors). We take $\boldsymbol{k}$ to be along
the z-direction as in our discussion of the ε’s in section 7.3.1, and choose two transverse
polarization vectors as (cf (7.81), (7.82))

$$\epsilon^\mu(k, \lambda = 1) = (0, 1, 0, 0), \qquad \epsilon^\mu(k, \lambda = 2) = (0, 0, 1, 0) \qquad \text{‘transverse polarizations’}. \tag{7.105}$$

The other two ε’s are

$$\epsilon^\mu(k, \lambda = 0) = (1, 0, 0, 0) \qquad \text{‘time-like polarization’} \tag{7.106}$$

and

$$\epsilon^\mu(k, \lambda = 3) = (0, 0, 0, 1) \qquad \text{‘longitudinal polarization’}. \tag{7.107}$$


Making (7.104) consistent with (7.103) then requires

$$[\hat\alpha_\lambda(k), \hat\alpha^\dagger_{\lambda'}(k')] = -g_{\lambda\lambda'}(2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}'). \tag{7.108}$$

This is where the wrong sign in (7.103) has come back to haunt us: we have the wrong sign
in (7.108) for the case λ = λ′ = 0 (time-like modes).

What is the consequence of this? It seems natural to assume that the vacuum is defined
by

$$\hat\alpha_\lambda(k)|0\rangle = 0 \quad \text{for all } \lambda = 0, 1, 2, 3. \tag{7.109}$$

But suppose we use (7.108) and (7.109) to calculate the normalization overlap of a ‘one
time-like photon’ state; this is

$$\langle k', \lambda = 0|k, \lambda = 0\rangle = \langle 0|\hat\alpha_0(k)\hat\alpha_0^\dagger(k')|0\rangle = -(2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \tag{7.110}$$




and the state effectively has a negative norm (the k = k’ infinity is the standard plane-wave 
artefact). Such states would threaten fundamental properties such as the conservation of 
total probability if they contributed, uncancelled, in physical processes. 

At this point we would do well to recall the condition ‘$\partial_\mu\hat{A}^\mu = 0$’, which still needs to be
taken into account, somehow, and it does indeed save us. Gupta (1950) and Bleuler (1950)
proposed that, rather than trying (unsuccessfully) to impose it as an operator condition,
one should replace it by the weaker condition

$$\partial_\mu\hat{A}^{\mu(+)}(x)|\Psi\rangle = 0 \tag{7.111}$$

where the (+) signifies the positive frequency part of $\hat{A}$, i.e. the part involving annihilation
operators, and $|\Psi\rangle$ is any physical state (including $|0\rangle$). From (7.111) and its Hermitian
conjugate

$$\langle\Psi|\partial_\mu\hat{A}^{\mu(-)}(x) = 0 \tag{7.112}$$

we can deduce that the Lorentz condition (7.70) does hold for all expectation values:

$$\langle\Psi|\partial_\mu\hat{A}^\mu|\Psi\rangle = \langle\Psi|\partial_\mu\hat{A}^{\mu(+)} + \partial_\mu\hat{A}^{\mu(-)}|\Psi\rangle = 0, \tag{7.113}$$

and so the classical limit of this quantization procedure will recover the classical Maxwell
theory in Lorentz gauge.

Using (7.104), (7.106), and (7.107) with $k^\mu = (|\boldsymbol{k}|, 0, 0, |\boldsymbol{k}|)$, condition (7.111) becomes

$$[\hat\alpha_0(k) - \hat\alpha_3(k)]|\Psi\rangle = 0. \tag{7.114}$$


To see the effect of this condition, consider the expression for the Hamiltonian of this theory. 
In normally ordered form, it turns out to be 


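$$\hat{H} = \int \frac{d^3k}{(2\pi)^3} \left(\hat\alpha_1^\dagger\hat\alpha_1 + \hat\alpha_2^\dagger\hat\alpha_2 + \hat\alpha_3^\dagger\hat\alpha_3 - \hat\alpha_0^\dagger\hat\alpha_0\right)\omega \tag{7.115}$$

so the contribution from the time-like modes looks dangerously negative. However, for any
physical state $|\Psi\rangle$, we have

$$\begin{aligned}
\langle\Psi|(\hat\alpha_3^\dagger\hat\alpha_3 - \hat\alpha_0^\dagger\hat\alpha_0)|\Psi\rangle &= \langle\Psi|(\hat\alpha_3^\dagger\hat\alpha_3 - \hat\alpha_3^\dagger\hat\alpha_0)|\Psi\rangle \\
&= \langle\Psi|\hat\alpha_3^\dagger(\hat\alpha_3 - \hat\alpha_0)|\Psi\rangle \\
&= 0,
\end{aligned} \tag{7.116}$$

where the first step uses the Hermitian conjugate of (7.114), so that only the transverse modes survive.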

We hope that by now the reader will have at least begun to develop a healthy respect for 
quantum gauge fields—and the non-Abelian versions in volume 2 are even worse! The fact 
is that the canonical approach has a difficult time coping with these constrained systems. 
Indeed, the complete Feynman rules in the non-Abelian case were found by an alternative 
quantization procedure (‘path integral’ quantization). This, however, is outside the scope of 
the present volume. The important points for our purposes are as follows. It is possible to 
carry out a consistent quantization in the Gupta—Bleuler formalism, which is the quantum 
version of the Maxwell theory constrained by the Lorentz condition. The propagator for the 
photon in this theory is

$$\frac{-ig^{\mu\nu}}{k^2 + i\epsilon} \tag{7.117}$$

which is the expected massless limit of the KG propagator as far as the spatial components
are concerned (the time-like component has that negative sign).

As in all the other cases we have dealt with so far, the Feynman propagator
$\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$ can be evaluated using the expansion (7.104) and the commutation relations
(7.108). One finds that it is indeed equal to the Fourier transform of $-ig^{\mu\nu}/(k^2 + i\epsilon)$, just
as asserted in (7.117). For this result, we need the ‘pseudo completeness relation’ (problem
7.13)

$$\begin{aligned}
&-\epsilon^\mu(k, \lambda = 0)\epsilon^\nu(k, \lambda = 0) + \epsilon^\mu(k, \lambda = 1)\epsilon^\nu(k, \lambda = 1) \\
&\quad + \epsilon^\mu(k, \lambda = 2)\epsilon^\nu(k, \lambda = 2) + \epsilon^\mu(k, \lambda = 3)\epsilon^\nu(k, \lambda = 3) = -g^{\mu\nu}.
\end{aligned} \tag{7.118}$$

We call this a pseudo completeness relation because of the minus sign appearing in the first
term: its origin in the evaluation of this vev is precisely the ‘wrong sign commutator’ for
the $\hat\alpha_0$ mode, (7.108).
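
With the explicit polarization vectors (7.105)–(7.107), the relation (7.118) reduces to integer arithmetic, and a short check makes the role of the extra minus sign visible. This is a sketch only; the dictionary `EPS` and the function name are illustrative, and the signature (+,−,−,−) is assumed.

```python
METRIC = [1, -1, -1, -1]                # diagonal of g, signature (+,-,-,-)
EPS = {
    0: [1, 0, 0, 0],   # time-like, (7.106)
    1: [0, 1, 0, 0],   # transverse, (7.105)
    2: [0, 0, 1, 0],   # transverse, (7.105)
    3: [0, 0, 0, 1],   # longitudinal, (7.107)
}

def pseudo_completeness_sum():
    """Sum over lambda of (-g_{lambda lambda}) eps^mu(lambda) eps^nu(lambda)."""
    total = [[0] * 4 for _ in range(4)]
    for lam, eps in EPS.items():
        weight = -METRIC[lam]           # -1 for lambda = 0, +1 otherwise
        for mu in range(4):
            for nu in range(4):
                total[mu][nu] += weight * eps[mu] * eps[nu]
    return total

minus_g = [[-METRIC[mu] if mu == nu else 0 for nu in range(4)] for mu in range(4)]
print(pseudo_completeness_sum() == minus_g)   # prints True
```

Writing the weight as $-g_{\lambda\lambda}$ ties the sign pattern of (7.118) to the commutator (7.108); with all four weights equal to +1 the sum would be the unit matrix, which is not a Lorentz tensor.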

Thus the gauge choice (7.70) can be made to work in quantum field theory via the 
condition (7.111). But other choices are possible too. In particular, a useful generalization 
of the Lagrangian (7.97) is

$$\mathcal{L}_\xi = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 \tag{7.119}$$

where ξ is a constant, the ‘gauge parameter’. $\mathcal{L}_\xi$ leads to the equation of motion (problem 7.14)

$$\left(\Box g_{\mu\nu} - \partial_\mu\partial_\nu + \frac{1}{\xi}\partial_\mu\partial_\nu\right)A^\nu = 0. \tag{7.120}$$

In momentum-space this becomes (problem 7.14)

$$\left(-k^2 g_{\mu\nu} + k_\mu k_\nu - \frac{1}{\xi}k_\mu k_\nu\right)\tilde{A}^\nu = 0. \tag{7.121}$$

The inverse of the matrix acting on $\tilde{A}^\nu$ exists, and gives us the more general photon propagator
(or Green function)

$$\frac{i\left[-g^{\mu\nu} + (1 - \xi)k^\mu k^\nu/k^2\right]}{k^2 + i\epsilon} \tag{7.122}$$

as shown in problem 7.14. The previous case is recovered as ξ → 1. Confusingly, the choice
ξ = 1 is often called the ‘Feynman gauge’, though in classical terms it corresponds to the
Lorentz gauge choice. For some purposes the ‘Landau gauge’ ξ = 0 (which is well defined in
(7.122)) is convenient. In any event, it is important to be clear that the photon propagator 
depends on the choice of gauge. Formula (7.122) is the photon analogue of ‘rule (ii)’ in 
(6.103). 
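
The statement that the matrix in (7.121) is invertible for finite non-zero ξ, with inverse given by the tensor structure of (7.122), can be checked by direct multiplication. The sketch below drops the overall factor $i$ and the $i\epsilon$, assumes the signature (+,−,−,−), and uses an arbitrary off-shell momentum; all names are illustrative.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]        # diagonal of g, signature (+,-,-,-)

def k_squared(k):
    return sum(METRIC[a] * k[a] ** 2 for a in range(4))

def operator_lower(k, xi):
    """M_{mu nu} = -k^2 g_{mu nu} + (1 - 1/xi) k_mu k_nu, from (7.121); xi must be non-zero."""
    k2 = k_squared(k)
    kl = [METRIC[a] * k[a] for a in range(4)]
    return [[-k2 * (METRIC[m] if m == n else 0.0) + (1 - 1 / xi) * kl[m] * kl[n]
             for n in range(4)] for m in range(4)]

def propagator_tensor(k, xi):
    """D^{nu rho} = [-g^{nu rho} + (1 - xi) k^nu k^rho / k^2] / k^2, cf. (7.122)."""
    k2 = k_squared(k)
    return [[(-(METRIC[n] if n == r else 0.0) + (1 - xi) * k[n] * k[r] / k2) / k2
             for r in range(4)] for n in range(4)]

def product(k, xi):
    """M_{mu nu} D^{nu rho}, which should be the Kronecker delta."""
    M, D = operator_lower(k, xi), propagator_tensor(k, xi)
    return [[sum(M[m][n] * D[n][r] for n in range(4)) for r in range(4)]
            for m in range(4)]

def deviation_from_identity(matrix):
    return max(abs(matrix[m][r] - (1.0 if m == r else 0.0))
               for m in range(4) for r in range(4))

k = [1.3, 0.2, -0.5, 0.9]               # arbitrary momentum with k^2 != 0
for xi in (1.0, 0.5, 3.0):
    print(xi, deviation_from_identity(product(k, xi)))   # rounding error only
```

The operator itself is not defined at ξ = 0, although the propagator tensor is; this is the sense in which the Landau gauge is reached only as a limit of (7.122).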

This may seem to imply that when we use the photon propagator (7.122) in Feynman 
amplitudes we will not get a definite answer, but rather one that depends on the arbitrary
parameter ξ. This is a serious worry. But the propagator is not by itself a physical
quantity—it is only one part of a physical amplitude. In the following chapter we shall 
derive the amplitudes for some simple processes in scalar and spinor electrodynamics, and 
one can verify that they are gauge invariant—either in the sense (for external photons) of 
being invariant under the replacement (7.76), or (in the case of internal photons) of being 
independent of ξ. It can be shown (Weinberg 1995, section 10.5) that at a given order in
perturbation theory the sum of all diagrams contributing to the S-matrix is gauge invariant. 




7.4 Introduction of electromagnetic interactions 


After all these preliminaries, the job of introducing the first of our gauge field interactions, 
namely electromagnetism, into our non-interacting theory of complex scalar fields, and of 
Dirac fields, is very easy. From our discussion in chapter 2, we have a strong indication of 
how to introduce electromagnetic interactions into our theories. The ‘gauge principle’ in 
quantum mechanics consisted in elevating a global (space-time-independent) U(1) phase 
invariance into a local (space-time-dependent) U(1) invariance—the compensating fields 
being then identified with the electromagnetic ones. In quantum field theory, exactly the 
same principle exists and leads to the form of the electromagnetic interactions. Indeed, 
in the field theory formalism we have a true local U(1) phase (gauge) invariance of the 
Lagrangian (rather than a gauge covariance of a wave equation) and we shall be able to 
exhibit explicitly the symmetry current, and symmetry operator, associated with the U(1) 
invariance—and identify them precisely with the electromagnetic current and charge. 

We have seen that for both the complex scalar and the Dirac fields the free Lagrangian 
is invariant under U(1) transformations (see (7.22) and (7.48)) which, we once again em- 
phasize, are global. Let us therefore promote these global invariances into local ones in the 
way learned in chapter 2—namely by invoking the ‘gauge principle’ replacement 

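$$\partial^\mu \to \hat{D}^\mu = \partial^\mu + iq\hat{A}^\mu \tag{7.123}$$

for a particle of charge q, this time written in terms of the quantum field $\hat{A}^\mu$. In the case of
the Dirac Lagrangian

$$\hat{\mathcal{L}}_{\rm D} = \hat{\bar\psi}(i\gamma^\mu\partial_\mu - m)\hat\psi \tag{7.124}$$

we expect to be able to ‘promote’ it to one which is invariant under the local U(1) phase
transformation²

$$\hat\psi(\boldsymbol{x},t) \to \hat\psi'(\boldsymbol{x},t) = e^{-iq\hat\chi(\boldsymbol{x},t)}\hat\psi(\boldsymbol{x},t) \tag{7.125}$$

provided we make the replacement (7.123) and demand that the (quantized) 4-vector potential
transforms as (cf (2.15) with the sign change for $\hat\chi$)

$$\hat{A}^\mu \to \hat{A}'^\mu = \hat{A}^\mu + \partial^\mu\hat\chi. \tag{7.126}$$

Thus the locally U(1)-invariant Dirac Lagrangian is expected to be

$$\hat{\mathcal{L}}_{\rm D\,local} = \hat{\bar\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi. \tag{7.127}$$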


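The invariance of (7.127) under (7.125) is easy to check, using the crucial property (2.43),
which clearly carries over to the quantum field case:

$$\hat{D}'^\mu\hat\psi' = e^{-iq\hat\chi}(\hat{D}^\mu\hat\psi). \tag{7.128}$$

Equation (7.128) implies at once that

$$(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi, \tag{7.129}$$

while taking the conjugate of (7.125) yields

$$\hat{\bar\psi}' = \hat{\bar\psi}e^{iq\hat\chi}. \tag{7.130}$$

² Note that the classical field $\chi(\boldsymbol{x},t)$ of (2.34) has become a quantum field $\hat\chi(\boldsymbol{x},t)$ in (7.125); the sign
change of $\hat\chi$ compared with $\chi$ is conventional in qft.

Thus we have

$$\hat{\bar\psi}'(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = \hat{\bar\psi}e^{iq\hat\chi}e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \tag{7.131}$$
$$\phantom{\hat{\bar\psi}'(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi'} = \hat{\bar\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \tag{7.132}$$

and the invariance is proved.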
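
The key step, property (7.128), can be illustrated numerically in a simplified setting. The sketch below is an assumption-laden toy: it treats ψ, A and χ as ordinary (c-number) functions of a single coordinate rather than quantum fields, takes derivatives by central finite differences, and uses arbitrarily chosen sample functions and charge. It checks that $D'\psi' = e^{-iq\chi}D\psi$ when ψ and A transform as in (7.125) and (7.126).

```python
import cmath
import math

Q = 0.8                                   # illustrative charge
H = 1e-5                                  # finite-difference step

def psi(x):
    """Sample complex matter field (classical stand-in)."""
    return cmath.exp(1.7j * x) * (1 + 0.3 * x)

def potential(x):
    """Sample potential A(x)."""
    return math.cos(2 * x)

def chi(x):
    """Sample gauge function chi(x)."""
    return math.sin(x) + 0.5 * x ** 2

def derivative(f, x):
    """Central finite-difference derivative."""
    return (f(x + H) - f(x - H)) / (2 * H)

def covariant_derivative(field, pot, x):
    """D = d/dx + i q A, a one-dimensional analogue of (7.123)."""
    return derivative(field, x) + 1j * Q * pot(x) * field(x)

def psi_prime(x):
    """Transformed matter field, as in (7.125)."""
    return cmath.exp(-1j * Q * chi(x)) * psi(x)

def potential_prime(x):
    """Transformed potential, as in (7.126)."""
    return potential(x) + derivative(chi, x)

x0 = 0.4
lhs = covariant_derivative(psi_prime, potential_prime, x0)
rhs = cmath.exp(-1j * Q * chi(x0)) * covariant_derivative(psi, potential, x0)
print(abs(lhs - rhs))                     # finite-difference error only
```

The derivative of the phase factor produces a term $-iq(\partial\chi)\psi'$, which is cancelled by the shift of the potential; replacing `potential_prime` by `potential` in the last lines makes the mismatch visible.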
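The Lagrangian has therefore gained an interaction term

$$\hat{\mathcal{L}}_{\rm D} \to \hat{\mathcal{L}}_{\rm D\,local} = \hat{\mathcal{L}}_{\rm D} + \hat{\mathcal{L}}_{\rm int} \tag{7.133}$$

where

$$\hat{\mathcal{L}}_{\rm int} = -q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu. \tag{7.134}$$

Since the addition of $\hat{\mathcal{L}}_{\rm int}$ has not changed the canonical momenta, the Hamiltonian then
becomes $\hat{\mathcal{H}} = \hat{\mathcal{H}}_{\rm D} + \hat{\mathcal{H}}'_{\rm D}$, where

$$\hat{\mathcal{H}}'_{\rm D} = -\hat{\mathcal{L}}_{\rm int} = q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu = q\hat\psi^\dagger\hat\psi\hat{A}_0 - q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi\cdot\hat{\boldsymbol{A}} \tag{7.135}$$

which is the field theory analogue of the potential in (3.102). It has the expected form
‘$\rho A_0 - \boldsymbol{j}\cdot\boldsymbol{A}$’ if we identify the electromagnetic charge density operator with $q\hat\psi^\dagger\hat\psi$ (the charge
times the number density operator) and the electromagnetic current density operator with
$q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi$. The electromagnetic 4-vector current operator $\hat{j}^\mu_{\rm em}$ is thus identified as

$$\hat{j}^\mu_{\rm em} = q\hat{\bar\psi}\gamma^\mu\hat\psi, \tag{7.136}$$

which is gauge invariant and a Lorentz 4-vector. The Lagrangian (7.134) is manifestly
Lorentz invariant.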
We now note that $\hat{j}^\mu_{\rm em}$ is just q times the symmetry current $\hat{N}^\mu_\psi$ of section 7.2 (see
equation (7.50)). Conservation of $\hat{j}^\mu_{\rm em}$ would follow from global U(1) invariance alone (i.e. $\hat\chi$ a
constant in equation (7.125)); but many Lagrangians, including interactions, could be constructed
obeying this global U(1) invariance. The force of the local U(1) invariance requirement
is that it has specified a unique form of the interaction (i.e. $\hat{\mathcal{L}}_{\rm int}$ of equation (7.134)).
Indeed, this is just $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$, so that in this type of theory the current $\hat{j}^\mu_{\rm em}$ is not only a
symmetry current, but also determines the precise way in which the vector potential $\hat{A}^\mu$
couples to the matter field $\hat\psi$. Adding the Lagrangian for the $\hat{A}^\mu$ field then completes the
theory of a charged fermion field interacting with the Maxwell field. In a general gauge, the
$\hat{A}^\mu$ field Lagrangian is the operator form of (7.119), $\hat{\mathcal{L}}_\xi$.

The interaction term $\hat{\mathcal{H}}'_{\rm D} = q\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ is a ‘three-fields-at-a-point’ kind of interaction
just like our 3-scalar interaction $g\hat\phi_A\hat\phi_B\hat\phi_C$ in chapter 6. We know, by now, exactly what all
the operators in $\hat{\mathcal{H}}'_{\rm D}$ are capable of: some of the possible emission and absorption processes
are shown in figure 7.3. Unlike the ‘ABC’ model with $m_C > m_A + m_B$ however, none of
these elementary ‘vertex’ processes can occur as a real physical process, because all are
forbidden by the requirement of overall 4-momentum conservation. However, they will of
course contribute as virtual transitions when ‘paired up’ to form Feynman diagrams, such
as those in figure 7.4 (compare figures 6.4 and 6.5).

It is worth remarking on the fact that the ‘coupling constant’ q is dimensionless, in our
units. Of course, we know this from its identification with the electromagnetic charge in this
case (see appendix C). But it is instructive to check it as follows. A Lagrangian density has
mass dimension $M^4$, since the action is dimensionless (with ħ = 1). Referring then to (7.33)
we see that the (mass) dimension of the $\hat\psi$ field is $M^{3/2}$, while (7.67) shows that that of $\hat{A}^\mu$
is M. It follows that $\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ has mass dimension $M^4$, and hence q must be dimensionless.


FIGURE 7.3
Possible basic vertices associated with the interaction density $e\hat{\bar\psi}\gamma^\mu\hat\psi\hat{A}_\mu$; these cannot occur
as physical processes due to energy-momentum constraints.


The application of the Dyson formalism of chapter 6 to fermions interacting via $\hat{\mathcal{H}}'_{\rm D}$ leads
directly to the Feynman rules for associating precise mathematical formulae with diagrams
such as those in figure 7.4, as usual. This will be presented in the following chapter: see
comment (3) in section 8.3.1 and appendix L. We may simply note here that a ‘$\hat{\bar\psi}$’ appears
along with a ‘$\hat\psi$’ in $\hat{\mathcal{H}}'_{\rm D}$, so that the process of ‘contraction’ (cf chapter 6) will lead to the
form $\langle 0|T(\hat\psi(x_1)\hat{\bar\psi}(x_2))|0\rangle$ of the Dirac propagator, as stated in section 7.2.


FIGURE 7.4
Lowest-order contributions to $\gamma e^- \to \gamma e^-$.


In the same way, the global U(1) invariance (7.22) of the complex scalar field may be
generalized to a local U(1) invariance incorporating electromagnetism. We have

$$\hat{\mathcal{L}}_{\rm KG} \to \hat{\mathcal{L}}_{\rm KG} + \hat{\mathcal{L}}_{\rm int} \tag{7.137}$$

where

$$\hat{\mathcal{L}}_{\rm KG} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - m^2\hat\phi^\dagger\hat\phi \tag{7.138}$$

and (under $\partial_\mu \to \hat{D}_\mu$)

$$\hat{\mathcal{L}}_{\rm int} = -iq\left(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi\right)\hat{A}_\mu + q^2\hat{A}^\mu\hat{A}_\mu\hat\phi^\dagger\hat\phi \tag{7.139}$$

which is the field theory analogue of the interaction in (3.100). The electromagnetic current
is

$$\hat{j}^\mu_{\rm em} = -\partial\hat{\mathcal{L}}_{\rm int}/\partial\hat{A}_\mu \tag{7.140}$$

as before, which from (7.139) is

$$\hat{j}^\mu_{\rm em} = iq\left(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi\right) - 2q^2\hat{A}^\mu\hat\phi^\dagger\hat\phi. \tag{7.141}$$

We note that for the boson case the electromagnetic current is not just q times the (number)
current $\hat{N}^\mu_\phi$ appropriate to the global phase invariance. This has its origin in the fact that the
boson current involves a derivative, and so the gauge invariant boson current must develop
a term involving $\hat{A}^\mu$ itself, as is evident in (7.141), and as we also saw in the wavefunction
case (cf equation (2.40)). The full scalar QED Lagrangian is completed by the inclusion of
$\hat{\mathcal{L}}_\xi$ as before.

The application of the formalism of chapter 6 is not completely straightforward in this
scalar case. The problem is that $\hat{\mathcal{L}}_{\rm int}$ of (7.139) involves derivatives of the fields and, in
particular, their time derivatives. Hence the canonical momenta will be changed from their
non-interacting forms. This, in turn, implies that the additional (interaction) term in the
Hamiltonian is not just $-\hat{\mathcal{L}}_{\rm int}$, as in the Dirac case, but is given by (problem 7.15)

$$\hat{\mathcal{H}}'_{\rm S} = -\hat{\mathcal{L}}_{\rm int} - q^2(\hat{A}^0)^2\hat\phi^\dagger\hat\phi. \tag{7.142}$$

The problem here is that the Hamiltonian and $-\hat{\mathcal{L}}_{\rm int}$ differ by a term which is non-covariant
(only $\hat{A}^0$ appears). This seems to threaten the whole approach of chapter 6. Fortunately,
another subtlety rescues the situation. There is a second source of non-covariance arising
from the time-ordering of terms involving time derivatives, which will occur when (7.142)
is used in the Dyson series (6.42). In particular, one can show (problem 7.16) that

$$\langle 0|T(\partial_{1\mu}\hat\phi(x_1)\,\partial_{2\nu}\hat\phi^\dagger(x_2))|0\rangle = \partial_{1\mu}\partial_{2\nu}\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle - ig_{\mu 0}g_{\nu 0}\delta^4(x_1 - x_2) \tag{7.143}$$

which also exhibits a non-covariant piece. A careful analysis (Itzykson and Zuber 1980,
section 6.1.4) shows that the two non-covariant effects exactly compensate, so that in the Dyson
series we may use $\hat{\mathcal{H}}'_{\rm S} = -\hat{\mathcal{L}}_{\rm int}$ after all. The Feynman rules for charged scalar electrodynamics
are given in appendix L.


7.5 P, C, and T in Quantum field theory 


We end this chapter by completing the discussion of the discrete symmetries which we began 
in section 4.2, extending it from the single particle (wavefunction) theory to quantum fields. 
We begin with the parity transformation. 


7.5.1 Parity 


The algebraic manipulations of section 4.2.1 apply equally well to the equations of motion for
the quantum field, and we can take over the results by replacing a transformed wavefunction
such as $\psi_P(\boldsymbol{x},t)$ by the corresponding transformed field $\hat\psi_P(\boldsymbol{x},t) = \hat{P}\hat\psi(\boldsymbol{x},t)\hat{P}^{-1}$ where $\hat{P}$
is a unitary quantum field operator (which we shall not need to calculate explicitly). Thus
we have

$$\hat\phi_P(\boldsymbol{x},t) = \hat\phi(-\boldsymbol{x},t) \tag{7.144}$$
$$\hat\psi_P(\boldsymbol{x},t) = \beta\hat\psi(-\boldsymbol{x},t), \tag{7.145}$$

for the KG and Dirac fields, and

$$\hat{\boldsymbol{A}}_P(\boldsymbol{x},t) = -\hat{\boldsymbol{A}}(-\boldsymbol{x},t), \qquad \hat{A}^0_P(\boldsymbol{x},t) = \hat{A}^0(-\boldsymbol{x},t) \tag{7.146}$$

for the electromagnetic fields. In (7.144)–(7.146) a simple choice of phase factor has been
made.

There is however one new feature in the quantum field case, which is that the commuta- 
tion or anticommutation relations must be left unchanged by the transformation, if it is to 
be an invariance of the theory. Evidently for P the only non-trivial case is the Dirac field, 
and it is easy to check that the anticommutation relations (7.44) and (7.45) are invariant 
under (7.145). 

Let us see the effect of P on the free particle expansion (7.35). Equation (7.145) becomes 


bp(x, t) = 


3k k r y 
J an 2 PKP ulh ser eR 4 PARP olh, seke] 


i k)Bu(k, e ikE + dt (k)Bu(k, sjettik- 2), (7,147) 


s=1,2 


e Qn ae 
Changing k to —k in the second integral and using the spinor properties 
Bu((w,—-k),s) =ulk,s),  Bv((w,—-k), s) = —v(k, s) (7.148) 
in the right-hand side of (7.147), we obtain the conditions 
Pé,(k)P~! = @(w,—-k), Pdt(k)P~! = -di (w,—k) (7.149) 


with similar ones for $\hat{c}_s^\dagger$ and $\hat{d}_s$. Since $\hat{c}_s^\dagger$ creates a fermion from the vacuum and $\hat{d}_s^\dagger$ creates
its antiparticle, it follows that a fermion and its antiparticle have opposite intrinsic parities.
Similarly, equation (7.146) shows, when applied to the expansion (7.104), that a physical
(transverse) photon has negative intrinsic parity.
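The spinor properties (7.148) can be confirmed numerically. The following Python sketch is an illustration only. It assumes the standard-representation forms $u = \sqrt{E+m}\,(\phi,\ \boldsymbol{\sigma}\cdot\boldsymbol{k}\,\phi/(E+m))$ and $v = \sqrt{E+m}\,(\boldsymbol{\sigma}\cdot\boldsymbol{k}\,\chi/(E+m),\ \chi)$ with $\beta = {\rm diag}(1,1,-1,-1)$; the function names are our own.

```python
import math

def energy(k, m):
    """E = sqrt(m^2 + k^2) for a 3-vector k."""
    return math.sqrt(m * m + sum(c * c for c in k))

def sigma_dot(k):
    """2x2 matrix sigma.k for k = (kx, ky, kz)."""
    kx, ky, kz = k
    return [[kz, kx - 1j * ky], [kx + 1j * ky, -kz]]

def small_components(k, m, two_spinor):
    """(sigma.k) two_spinor / (E + m)."""
    sk = sigma_dot(k)
    denom = energy(k, m) + m
    return [sum(sk[i][j] * two_spinor[j] for j in range(2)) / denom for i in range(2)]

def u_spinor(k, m, phi):
    """Assumed form of the positive-energy spinor, normalized to u^dagger u = 2E."""
    n = math.sqrt(energy(k, m) + m)
    return [n * c for c in list(phi) + small_components(k, m, phi)]

def v_spinor(k, m, chi):
    """Assumed form of the 'v' spinor, normalized to v^dagger v = 2E."""
    n = math.sqrt(energy(k, m) + m)
    return [n * c for c in small_components(k, m, chi) + list(chi)]

def apply_beta(spinor):
    """Multiply by beta = diag(1, 1, -1, -1)."""
    return [spinor[0], spinor[1], -spinor[2], -spinor[3]]

def close(a, b, tol=1e-12):
    """Component-wise comparison of two spinors."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

def check_7_148(k, m):
    """True if beta u(-k) = u(k) and beta v(-k) = -v(k) for both basis two-spinors."""
    minus_k = tuple(-c for c in k)
    ok = True
    for two_spinor in [(1, 0), (0, 1)]:
        ok = ok and close(apply_beta(u_spinor(minus_k, m, two_spinor)),
                          u_spinor(k, m, two_spinor))
        ok = ok and close(apply_beta(v_spinor(minus_k, m, two_spinor)),
                          [-c for c in v_spinor(k, m, two_spinor)])
    return ok
```

The relative minus sign in the second relation is what produces the opposite intrinsic parities of fermion and antifermion in (7.149).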

Turning now to the electromagnetic interaction, it is clear that $\hat{j}^\mu_{\rm em}(x) = q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$
has exactly the same transformation properties under P as $\bar{\psi}\gamma^\mu\psi$ had: namely, $\hat{j}^0_{\rm em}(x)$
is a scalar and $\hat{\boldsymbol{j}}_{\rm em}(x)$ is a polar vector. Since this is also the way $\hat{A}^\mu$ transforms, according
to (7.146), it follows that the interaction $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is parity invariant, as we expect for QED.
The scalar interaction (7.139) is also parity invariant.


7.5.2 Charge conjugation 


The discussion of C proceeds similarly, the transformation being represented by a unitary
quantum field operator $\hat{C}$ such that

$$\hat{C}\hat{\phi}\hat{C}^{-1} = \hat{\phi}^\dagger \tag{7.150}$$
$$\hat{C}\hat{\psi}\hat{C}^{-1} = i\gamma^2\hat{\psi}^{\dagger T} \tag{7.151}$$
$$\hat{C}\hat{A}^\mu\hat{C}^{-1} = -\hat{A}^\mu \tag{7.152}$$

in the three cases of interest. Note that in terms of the decomposition (7.15) of the complex
field $\hat{\phi}$ into the two real fields $\hat{\phi}_1$ and $\hat{\phi}_2$, (7.150) reads

$$\hat{C}(\hat{\phi}_1 - i\hat{\phi}_2)\hat{C}^{-1} = \hat{\phi}_1 + i\hat{\phi}_2. \tag{7.153}$$


The reader may check (problem 7.17(a)) that the Dirac field anticommutation relations are
invariant under (7.151).
Applying (7.150) to the free field expansion (7.16), we easily find

$$\hat{C}\hat{a}(k)\hat{C}^{-1} = \hat{b}(k), \qquad \hat{C}\hat{b}^\dagger(k)\hat{C}^{-1} = \hat{a}^\dagger(k), \tag{7.154}$$

so that particle and antiparticle operators are interchanged. The conditions (7.154) are of
course consistent with (7.153). It follows that the normally ordered $\hat{H}$ of (7.21) is even under
C, while the normally ordered number density (7.24) is odd, the ordering being with Bose
commutation relations. Carrying out the same steps for the Dirac field, and using the spinor
relations (4.95) and (4.96), we obtain

$$\hat{C}\hat{c}_s(k)\hat{C}^{-1} = \hat{d}_s(k), \qquad \hat{C}\hat{d}_s^\dagger(k)\hat{C}^{-1} = \hat{c}_s^\dagger(k); \tag{7.155}$$


particle and antiparticle operators are again interchanged. We particularly note that the
Dirac Hamiltonian (7.55) is even under C, while the Dirac number operator (7.54) is odd,
in both cases after normal ordering with anticommutation relations (Fermi statistics). The
reader may check (problem 7.17(b)) that the electromagnetic current density $q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$
is odd under C, when normally ordered, and so the interaction $\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is C-invariant. The
same is true for the KG case, after normal ordering using Bose statistics.

In section 4.2.2 we introduced self-conjugate (Majorana) spinors. In extending that dis- 
cussion to quantum field theory, it is again convenient to use the alternative representation 
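(3.40) for the Dirac matrices, since we can then read off the Lorentz transformation
properties from the results of section 4.1.2. Consider the 4-component Majorana field

$$\hat{\psi}_M(x) = \begin{pmatrix} -i\sigma_2\hat{\chi}^{\dagger T}(x) \\ \hat{\chi}(x) \end{pmatrix}. \tag{7.156}$$

It is easy to check from (4.19) and (4.42) that the quantity $\sigma_2\chi^*(x)$ transforms like a
$\phi$-type spinor, and so the construction (7.156) is consistent with Lorentz covariance. The
C-conjugate field is

$$\hat{\psi}_{M,C} = i\gamma^2\hat{\psi}_M^{\dagger T} = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix} \begin{pmatrix} -i\sigma_2\hat{\chi}(x) \\ \hat{\chi}^{\dagger T}(x) \end{pmatrix} = \hat{\psi}_M(x), \tag{7.157}$$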
showing that it is self-conjugate. It is clear that the Majorana field has only two independent
degrees of freedom, those in $\hat{\chi}(x)$, in contrast to the Dirac field which has four (we could
of course have equally well constructed a Majorana field using a $\phi$-type spinor field instead
of a $\chi$-type one). The four degrees of freedom of the Dirac field correspond physically to
fermion and antifermion, spin up and down, but the Majorana fermion is the same as its
antiparticle. The free field expansion corresponding to (7.35) for a Majorana field is

$$\hat{\psi}_M(x) = \int \frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}} \sum_{\lambda} \left[ \hat{c}_\lambda(k) u(k,\lambda)\, e^{-ik\cdot x} + \hat{c}_\lambda^\dagger(k) v(k,\lambda)\, e^{ik\cdot x} \right]. \tag{7.158}$$


The Lagrangian for a free Majorana field may be taken to be $\frac{1}{2}\bar{\hat{\psi}}_M(i\slashed{\partial} - m)\hat{\psi}_M$, which the
reader can rewrite in terms of $\hat{\chi}$. For example, the mass term is

$$-\tfrac{1}{2}m\bar{\hat{\psi}}_M\hat{\psi}_M = -\tfrac{1}{2}m\hat{\chi}^T i\sigma_2\hat{\chi} + \text{Hermitian conjugate}. \tag{7.159}$$

We note that this expression will vanish unless the components $\hat{\chi}_1$ and $\hat{\chi}_2$ anticommute with
each other.

7.5.3 Time reversal


In section 4.2.4 we found that the time reversal transformation for the single particle theories 
was not represented by a unitary operator, but rather by the product of a unitary operator 
and the complex conjugation operator. We can see that the same must be true in quantum 
field theory by considering the equation of motion (6.18) for a scalar field (for simplicity), 
in the interaction picture: 
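$$\frac{\partial\hat{\phi}(\boldsymbol{x},t)}{\partial t} = i[\hat{H}_0, \hat{\phi}(\boldsymbol{x},t)]. \tag{7.160}$$

Suppose the field $\hat{\phi}_T$ in the time reversed frame were related to $\hat{\phi}$ by a unitary quantum
field operator $\hat{U}_T$ so that (suppressing the spatial argument) $\hat{U}_T\hat{\phi}(t)\hat{U}_T^\dagger = \hat{\phi}_T(t')$, where
$t' = -t$. Then applying $\hat{U}_T \ldots \hat{U}_T^\dagger$ to equation (7.160) we would obtain

$$\frac{\partial\hat{\phi}_T(t')}{\partial t} = i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')] \tag{7.161}$$

or equivalently

$$\frac{\partial\hat{\phi}_T(t')}{\partial t'} = -i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')]. \tag{7.162}$$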


To restore (7.162) to the form (7.160), i.e. for covariance to hold, would require that $\hat{U}_T$
transforms $\hat{H}_0$ to $-\hat{H}_0$. But this is unacceptable on physical grounds, because the eigenvalues
of $\hat{H}_0$ must be positive relative to the vacuum, both before and after the transformation.
We must therefore write the transformation as

$$\hat{T} = \hat{U}_T K \tag{7.163}$$

where, as in section 4.2.4, K takes the complex conjugate of ordinary numbers and functions
(i.e. it replaces i by $-i$). The operator $\hat{U}_T$ depends on the field involved, but we shall not
need to exhibit it explicitly.

We must now decide how the fields transform under T. We can be guided by our work in 
section 4.2.4 in the single particle theory, remembering that a wavefunction is the vacuum 
to one particle matrix element of the corresponding quantum field operator (see Comment 
(5) in section 5.2.5), and also that matrix elements of operators and their time-reversed 
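transforms are related by (4.126). In the case of the KG field, for example, let us take in
(4.126) $\langle\psi_2| = \langle 0|$, $\hat{O} = \hat{\phi}(x)$, and $|\psi_1\rangle = |a; p\rangle$ for the state of one 'a' particle with
4-momentum p. Then (4.126) gives

$$\phi(x) = \langle 0|\hat{\phi}(x)|a; E, \boldsymbol{p}\rangle = \langle 0_T|\hat{T}\hat{\phi}(x)\hat{T}^{-1}|a; E, -\boldsymbol{p}\rangle^*, \tag{7.164}$$

where $\phi(x)$ is the free particle solution $\exp(-iEt + i\boldsymbol{p}\cdot\boldsymbol{x})/(2E)^{1/2}$. Now in section 4.2.4 we
found the result $\phi_T(\boldsymbol{x},t) = \phi^*(\boldsymbol{x},-t)$ for the time-reversed solution. This will be consistent
with (7.164) if we take, in the quantum field case,

$$\hat{T}\hat{\phi}(\boldsymbol{x},t)\hat{T}^{-1} = \hat{\phi}(\boldsymbol{x},-t), \tag{7.165}$$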


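assuming that the vacuum is invariant. Applying (7.165) to the free field expansion (4.5)
gives

$$\hat{T}\hat{\phi}(\boldsymbol{x},t)\hat{T}^{-1} = \int \frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}} \left[ \hat{U}_T\hat{a}(k)\hat{U}_T^\dagger\, e^{i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{U}_T\hat{b}^\dagger(k)\hat{U}_T^\dagger\, e^{-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}} \right] \tag{7.166}$$
$$= \hat{\phi}(\boldsymbol{x},-t) = \int \frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}} \left[ \hat{a}(k)\, e^{i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{b}^\dagger(k)\, e^{-i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}} \right]. \tag{7.167}$$

Note that the plane wave functions have been complex conjugated in (7.166), because $\hat{T}$
contains K. Changing $\boldsymbol{k}$ to $-\boldsymbol{k}$ in the integral in (7.167), we obtain the conditions

$$\hat{U}_T\hat{a}(\omega,\boldsymbol{k})\hat{U}_T^\dagger = \hat{a}(\omega,-\boldsymbol{k}), \qquad \hat{U}_T\hat{b}^\dagger(\omega,\boldsymbol{k})\hat{U}_T^\dagger = \hat{b}^\dagger(\omega,-\boldsymbol{k}). \tag{7.168}$$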


The transformation preserves particle and antiparticle, and reverses the 3-momentum in 
the creation and annihilation operators. 
For the Dirac theory, we take, similarly, 


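$$\hat{T}\hat{\psi}(\boldsymbol{x},t)\hat{T}^{-1} = i\alpha_1\alpha_3\hat{\psi}(\boldsymbol{x},-t) \tag{7.169}$$

as suggested by (4.118). The reader may check that the anticommutation relations are left
invariant by (7.169). Applying (7.169) to the free field expansion (7.35), and taking the
spinors to be helicity eigenstates as in section 4.2.5, we obtain the conditions

$$\hat{U}_T\hat{c}_\lambda(\omega,\boldsymbol{k})\hat{U}_T^\dagger = \hat{c}_\lambda(\omega,-\boldsymbol{k}), \qquad \hat{U}_T\hat{d}_\lambda^\dagger(\omega,\boldsymbol{k})\hat{U}_T^\dagger = \hat{d}_\lambda^\dagger(\omega,-\boldsymbol{k}). \tag{7.170}$$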


Once again, the 3-momentum has been reversed in the creation and annihilation operators. 


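Let us check the behaviour of the current density $\hat{j}^\mu_{\rm em}(x) = q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$ under the
transformation (7.169). Recalling that in the standard representation $i\alpha_1\alpha_3 = \Sigma_2$, we find

$$\hat{T}\hat{j}^0_{\rm em}(\boldsymbol{x},t)\hat{T}^{-1} = \hat{j}^0_{\rm em}(\boldsymbol{x},-t)$$
$$\hat{T}\hat{\boldsymbol{j}}_{\rm em}(\boldsymbol{x},t)\hat{T}^{-1} = q\hat{\psi}^\dagger(\boldsymbol{x},-t)\Sigma_2\boldsymbol{\alpha}^*\Sigma_2\hat{\psi}(\boldsymbol{x},-t) = -\hat{\boldsymbol{j}}_{\rm em}(\boldsymbol{x},-t). \tag{7.171}$$

This is exactly how $\hat{A}^\mu(x)$ transforms, and hence the electromagnetic
interaction $\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is T-invariant. The same is true in the KG case.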
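The two matrix identities used in obtaining (7.171), namely $i\alpha_1\alpha_3 = \Sigma_2$ and $\Sigma_2\boldsymbol{\alpha}^*\Sigma_2 = -\boldsymbol{\alpha}$, can be checked directly. The Python sketch below is an illustration under the assumption of the standard representation, $\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}$ and $\Sigma_2 = {\rm diag}(\sigma_2, \sigma_2)$; the helper names are our own.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def conjugate(a):
    """Element-wise complex conjugate (no transpose)."""
    return [[complex(x).conjugate() for x in row] for row in a]

def scale(c, a):
    """Multiply every element of a by the number c."""
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

def equal(a, b, tol=1e-12):
    """Element-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

ZERO = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
ALPHA = [block(ZERO, s, s, ZERO) for s in SIGMA]                   # standard representation (assumed)
SIGMA2_4x4 = block(SIGMA[1], ZERO, ZERO, SIGMA[1])

def t_matrix():
    """The matrix i alpha_1 alpha_3 appearing in (7.169)."""
    return scale(1j, matmul(ALPHA[0], ALPHA[2]))

def transformed_alpha(j):
    """Sigma_2 alpha_j^* Sigma_2, the combination sandwiched between the fields in (7.171)."""
    return matmul(matmul(SIGMA2_4x4, conjugate(ALPHA[j])), SIGMA2_4x4)
```

The complex conjugate on $\boldsymbol{\alpha}$ comes from the operator K in (7.163); without it the spatial current would not change sign.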

We may now proceed to look at some simple processes in scalar and spinor electrody- 
namics, in the following two chapters. 


Problems 


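7.1 Verify that the Lagrangian $\hat{\mathcal{L}}$ of (7.1) is invariant (i.e. $\hat{\mathcal{L}}(\hat{\phi}_1,\hat{\phi}_2) = \hat{\mathcal{L}}(\hat{\phi}'_1,\hat{\phi}'_2)$) under the
transformation (7.2) of the fields $(\hat{\phi}_1,\hat{\phi}_2) \to (\hat{\phi}'_1,\hat{\phi}'_2)$.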


7.2 


(a) Verify that, for $\hat{N}^\mu_\phi$ given by (7.23), the corresponding $\hat{N}_\phi$ of (7.14) reduces to
the form (7.24); and that, with $\hat{H}$ given by (7.21),

$$[\hat{N}_\phi, \hat{H}] = 0.$$


(b) Verify equation (7.27). 


7.3 Show that

$$[\hat{\phi}(x_1), \hat{\phi}^\dagger(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0.$$

[Hint: insert expression (7.16) for the $\hat{\phi}$'s and use the commutation relations (7.18) to express
the commutator as the difference of two integrals; in the second integral, $x_1 - x_2$ can be
transformed to $-(x_1 - x_2)$ by a Lorentz transformation: the time-ordering of space-like
separated events is frame-dependent!]

7.4 Verify that varying $\psi^\dagger$ in the action principle with Lagrangian (7.34) gives the Dirac
equation.


7.5 Verify (7.44). 

7.6 Verify equations (7.52) and (7.53). 

7.7 Verify (7.62). 

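7.8 Verify the expression given in (7.64) for $\sum_s u(k,s)\bar{u}(k,s)$. [Hint: first, note that u is
a four-component Dirac spinor arranged as a column, while $\bar{u}$ is another four-component
spinor but this time arranged as a row because of the transpose in the $\dagger$ symbol. So '$u\bar{u}$'
has the form

$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} \begin{pmatrix} \bar{u}_1 & \bar{u}_2 & \bar{u}_3 & \bar{u}_4 \end{pmatrix} = \begin{pmatrix} u_1\bar{u}_1 & u_1\bar{u}_2 & \cdots \\ u_2\bar{u}_1 & u_2\bar{u}_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}.$$

Verify that

$$\phi^1\phi^{1\dagger} + \phi^2\phi^{2\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.]$$

Similarly, verify the expression for $\sum_s v(k,s)\bar{v}(k,s)$.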


7.9 Verify the result quoted in (7.63) for the Feynman propagator for the Dirac field. 


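7.10 Verify that if $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\mu_{\rm em}A_\mu$, where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the Euler-Lagrange
equations for $A_\mu$ yield the Maxwell form

$$\Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = j^\mu_{\rm em}.$$

[Hint: it is helpful to use antisymmetry of $F_{\mu\nu}$ to rewrite the '$F \cdot F$' term as $-\frac{1}{2}F_{\mu\nu}\partial^\mu A^\nu$.]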

7.11 


(a) Show that the Fourier transform of the free-field equation for $A^\mu$ (i.e. the one in
the previous question with $j^\mu_{\rm em}$ set to zero) is given by (7.90).
(b) Verify (7.94).

7.12 Show that the equation of motion for $A^\mu$, following from the Lagrangian of (7.97),
is

$$\Box A^\mu = 0.$$


7.13 Verify equation (7.118). 
7.14 Verify equations (7.120), (7.121), and (7.122). 


7.15 Verify the form (7.142) of the interaction Hamiltonian, $\hat{\mathcal{H}}'_S$, in charged spin-0
electrodynamics.


7.16 Verify equation (7.143).
7.17


(a) Check that the anticommutation relations (7.44) and (7.45) are left invariant 
under (7.151). 


(b) Check that the Dirac electromagnetic current density $q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$ is odd under
C when normally ordered. (Hint: the normally ordered current can be written as 


Tieri) 
8


Elementary Processes in Scalar and Spinor 
Electrodynamics 


8.1 Coulomb scattering of charged spin-0 particles 


We begin our study of electromagnetic interactions by considering the simplest case, that of
the scattering of a (hypothetical) positively charged spin-0 particle '$s^+$' by a fixed Coulomb
potential, treated as a classical field. This will lead us to the relativistic generalization of
the Rutherford formula for the cross section. We shall use this example as an exercise to
gain familiarity with the quantum field-theoretic approach of chapter 6, since it can also
be done straightforwardly using the 'wavefunction' approach familiar from non-relativistic
quantum mechanics, when supplemented by the work of chapter 3. We shall also look at
'$s^-$' Coulomb scattering, to test the anti-particle prescriptions of chapter 3. Incidentally,
we call these scalar particles $s^\pm$ to emphasize that they are not to be identified with, for
instance, the physical pions $\pi^\pm$, since the latter are composite ($q\bar{q}$) systems, and hence their
interactions are more complicated than those of our hypothetical 'point-like' $s^\pm$ (as we shall
see in section 8.4). No point-like charged scalar particles have been discovered, as yet.


8.1.1 Coulomb scattering of s⁺ (wavefunction approach)


Consider the scattering of a spin-0 particle of charge e and mass M, the '$s^+$', in an
electromagnetic field described by the classical potential $A^\mu$. The process we are considering
is

$$s^+(p) \to s^+(p') \tag{8.1}$$

as shown in figure 8.1, where p and p' are the initial and final 4-momenta, respectively. The
appropriate potential for use in the KG equation has been given in section 3.5:

$$\hat{V}_{\rm KG} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu) - e^2A^2. \tag{8.2}$$

As we shall see in more detail as we go along, the parameter characterizing each order
of perturbation theory based on this potential is found to be $e^2/4\pi$. In natural units (see
appendices B and C) this has the value

$$\alpha = e^2/4\pi \approx \frac{1}{137} \tag{8.3}$$

for the elementary charge e. $\alpha$ is called the fine structure constant. The smallness of $\alpha$ is
the reason why a perturbation approach has been very successful for QED.
To lowest order in $\alpha$ we can neglect the $e^2A^2$ term and the perturbing potential is then

$$\hat{V} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu). \tag{8.4}$$

FIGURE 8.1
Coulomb scattering of s⁺.


For a scattering process we shall assume¹ the same formula for the transition amplitude as
in non-relativistic quantum mechanics (NRQM) time-dependent perturbation theory (see
appendix A, equations (A.23) and (A.24)):

$$A_{s^+} = -i\int d^4x\, \phi'^*\hat{V}\phi \tag{8.5}$$

where $\phi$ and $\phi'$ are the initial and final state free-particle solutions, respectively. The latter
are (recall equation (3.11))

$$\phi = Ne^{-ip\cdot x} \tag{8.6}$$
$$\phi' = N'e^{-ip'\cdot x} \tag{8.7}$$

and we shall fix the normalization factors later. Inserting the expression for $\hat{V}$ into (8.5),
and doing some integration by parts (problem 8.1), we obtain

$$A_{s^+} = -i\int d^4x\, \{ie[\phi'^*(\partial_\mu\phi) - (\partial_\mu\phi'^*)\phi]\}A^\mu. \tag{8.8}$$


The expression inside the braces is very reminiscent of the probability current expression
(3.20). Indeed we can write (8.8) as

$$A_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) \tag{8.9}$$

where

$$j^\mu_{{\rm em},s^+}(x) = ie(\phi'^*\partial^\mu\phi - (\partial^\mu\phi'^*)\phi) \tag{8.10}$$

can be regarded as an electromagnetic 'transition current', analogous to the simple
probability current for a single state. In the following section we shall see the exact meaning
of this idea, using quantum field theory. Meanwhile, we insert the plane-wave free-particle
solutions (8.6) and (8.7) for $\phi$ and $\phi'$ into (8.10) to obtain

$$j^\mu_{{\rm em},s^+}(x) = NN'e(p + p')^\mu e^{-i(p-p')\cdot x} \tag{8.11}$$

so that (8.9) becomes

$$A_{s^+} = -iNN'\int d^4x\, e(p + p')_\mu e^{-i(p-p')\cdot x}A^\mu(x). \tag{8.12}$$


In the case of Coulomb scattering from a static point charge Ze (e > 0), the vector
potential $A^\mu$ is given by

$$A^0 = \frac{Ze}{4\pi|\boldsymbol{x}|}, \qquad \boldsymbol{A} = 0. \tag{8.13}$$

1 Justification may be found in chapter 9 of Bjorken and Drell (1964).

Inserting (8.13) into (8.12) we obtain

$$A_{s^+} = -iNN'Ze^2(E + E')\int e^{-i(E-E')t}\,dt \int \frac{e^{i(\boldsymbol{p}-\boldsymbol{p}')\cdot\boldsymbol{x}}}{4\pi|\boldsymbol{x}|}\,d^3\boldsymbol{x}. \tag{8.14}$$
The initial and final 4-momenta are

$$p = (E, \boldsymbol{p}), \qquad p' = (E', \boldsymbol{p}')$$

with $E = \sqrt{M^2 + \boldsymbol{p}^2}$, $E' = \sqrt{M^2 + \boldsymbol{p}'^2}$. The first (time) integral in (8.14) gives an
energy-conserving $\delta$-function $2\pi\delta(E - E')$ (see appendix E), as is expected for a static (non-recoiling)
scattering centre. The second (spatial) integral is the Fourier transform of $1/4\pi|\boldsymbol{x}|$, which
can be obtained from (1.13), (1.26), and (1.27) by setting $m_U = 0$; the result is $1/\boldsymbol{q}^2$ where
$\boldsymbol{q} = \boldsymbol{p} - \boldsymbol{p}'$. Hence

$$A_{s^+} = -iNN'\,2\pi\delta(E - E')\,\frac{Ze^2}{\boldsymbol{q}^2}\,2E \tag{8.15}$$
$$\equiv -i(2\pi)\delta(E - E')V_{s^+} \quad \text{(cf equation (A.25))} \tag{8.16}$$

where in (8.15) we have used E = E' in the matrix element. This is in the standard form
met in time-dependent perturbation theory (cf equations (A.25) and (A.26)).
The transition probability per unit time is then (appendix H, equation (H.18))

$$\dot{P}_{s^+} = 2\pi|V_{s^+}|^2\rho(E') \tag{8.17}$$

where $\rho(E')$ is the density of final states per energy interval dE'. This will depend on the
normalization adopted for $\phi$, $\phi'$ via the factors N, N'. We choose these to be unity, which
means that we are adopting the 'covariant' normalization of 2E particles per unit volume.
Then (cf equation (H.22))

$$\rho(E')\,dE' = \frac{|\boldsymbol{p}'|^2\,d|\boldsymbol{p}'|}{(2\pi)^3\,2E'}\,d\Omega. \tag{8.18}$$

Using $E' = (M^2 + \boldsymbol{p}'^2)^{1/2}$ one easily finds

$$\rho(E') = \frac{|\boldsymbol{p}'|\,d\Omega}{16\pi^3}. \tag{8.19}$$

Note that this differs from equation (H.22) since here we are using relativistic kinematics.
To obtain the cross section, we need to divide $\dot{P}_{s^+}$ by the incident flux, which is $2|\boldsymbol{p}|$ in
our normalization. Hence

$$d\sigma = (4Z^2e^4E^2/16\pi^2\boldsymbol{q}^4)\,d\Omega. \tag{8.20}$$

Finally, since $\boldsymbol{q}^2 = (\boldsymbol{p} - \boldsymbol{p}')^2 = 4|\boldsymbol{p}|^2\sin^2\theta/2$ (cf section 1.3.4) where $\theta$ is the angle between
$\boldsymbol{p}$ and $\boldsymbol{p}'$, we obtain

$$\frac{d\sigma}{d\Omega} = (Z\alpha)^2\frac{E^2}{4|\boldsymbol{p}|^4\sin^4\theta/2}. \tag{8.21}$$


This is the Rutherford formula with relativistic kinematics, showing the characteristic
$\sin^{-4}\theta/2$ angular dependence (cf figure 1.8). This deservedly famous formula will serve
as a 'reference point' for all the subsequent calculations in this chapter, as we proceed to
add in various complications, such as spin, recoil, and structure. The non-relativistic form
may be retrieved by replacing E by M.
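As a numerical illustration of the step from (8.20) to (8.21), the following Python sketch evaluates the cross section both ways, in natural units. It is a sketch only: the function names and the numerical value used for $\alpha$ are our own assumptions.

```python
import math

ALPHA = 1 / 137.036  # fine structure constant (approximate value, assumed here)

def rutherford_relativistic(E, M, theta, Z=1, alpha=ALPHA):
    """d(sigma)/d(Omega) of (8.21): (Z alpha)^2 E^2 / (4 |p|^4 sin^4(theta/2))."""
    p2 = E * E - M * M                      # |p|^2 from E^2 = M^2 + p^2
    s2 = math.sin(theta / 2) ** 2
    return (Z * alpha) ** 2 * E * E / (4 * p2 * p2 * s2 * s2)

def rutherford_from_q2(E, M, theta, Z=1, alpha=ALPHA):
    """The same quantity assembled from (8.20): 4 Z^2 e^4 E^2 / (16 pi^2 q^4)."""
    e2 = 4 * math.pi * alpha                # e^2 = 4 pi alpha, from (8.3)
    q2 = 4 * (E * E - M * M) * math.sin(theta / 2) ** 2
    return 4 * Z * Z * e2 * e2 * E * E / (16 * math.pi ** 2 * q2 * q2)

def rutherford_nonrelativistic(p, M, theta, Z=1, alpha=ALPHA):
    """Non-relativistic form, obtained from (8.21) by replacing E by M."""
    s2 = math.sin(theta / 2) ** 2
    return (Z * alpha) ** 2 * M * M / (4 * p ** 4 * s2 * s2)
```

For the same momentum and angle, the relativistic and non-relativistic results differ by the factor $E^2/M^2$, which tends to unity for $|\boldsymbol{p}| \ll M$.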
8.1.2 Coulomb scattering of s⁺ (field-theoretic approach)


We follow steps closely similar to those in section 6.3.1, making use of the result quoted in
section 7.4, that the appropriate interaction Hamiltonian for use in the Dyson series (6.42)
is $\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int}$ where $\hat{\mathcal{L}}_{\rm int}$ is given by (7.139), with q = e. As in the step from (8.2) to (8.4)
we discard the $e^2$ term to first order and use

$$\hat{\mathcal{H}}'_S(x) = ie(\hat{\phi}^\dagger(x)\partial^\mu\hat{\phi}(x) - (\partial^\mu\hat{\phi}^\dagger(x))\hat{\phi}(x))A_\mu(x). \tag{8.22}$$

Equation (8.22) can be written as $\hat{j}^\mu_{{\rm em},s}A_\mu$ where

$$\hat{j}^\mu_{{\rm em},s} = ie(\hat{\phi}^\dagger\partial^\mu\hat{\phi} - (\partial^\mu\hat{\phi}^\dagger)\hat{\phi}). \tag{8.23}$$

Note that the field $A_\mu$ is not quantized; it is being treated as an 'external' classical potential.
The expansion for the field $\hat{\phi}$ is given in (7.16). As in (6.48), the lowest-order amplitude is

$$A_{s^+} = -i\langle s^+, p'|\int d^4x\, \hat{\mathcal{H}}'_S(x)|s^+, p\rangle \tag{8.24}$$

where (cf (6.49))

$$|s^+, p\rangle = \sqrt{2E}\,\hat{a}^\dagger(p)|0\rangle. \tag{8.25}$$

We are, of course, anticipating in our notation that (8.24) will indeed be the same as (8.12).
The required amplitude is then

$$A_{s^+} = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.26}$$

Using the expansion (7.16), the definition (8.25) and the vacuum conditions (7.30), and
following the method of section 6.3.1, it is a good exercise to check that the value of the
matrix element in (8.26) is (problem 8.2)

$$\langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle = e(p + p')^\mu e^{-i(p-p')\cdot x}. \tag{8.27}$$

This is exactly the same as the expression we obtained in (8.11) for the wave mechanical
transition current in this case, using the normalization N = N' = 1, which is consistent with
the field-theoretic normalization in (8.25). Thus our wave mechanical transition current is
indeed the matrix element of the field-theoretical electromagnetic current operator:

$$j^\mu_{{\rm em},s^+}(x) = \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle. \tag{8.28}$$

Combining all these results, we have therefore connected the 'wavefunction' amplitude and
the 'field-theory' amplitude via

$$A_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.29}$$

We note that because of the static nature of the potential, and the non-covariant choice of
$A^\mu$ (only $A^0 \neq 0$), our answer in either case cannot be expected to yield a Lorentz invariant
amplitude.


FIGURE 8.2 

Coulomb scattering of s⁻: (a) the physical process with anti-particles of positive
4-momentum and (b) the related unphysical process with particles of negative 4-momentum
using the Feynman prescription.


8.1.3 Coulomb scattering of s⁻

The physical process is (figure 8.2(a))

$$s^-(p) \to s^-(p') \tag{8.30}$$

where, of course, E and E' are both positive ($E = (M^2 + \boldsymbol{p}^2)^{1/2}$ and similarly for E').
Since the charge on the anti-particle $s^-$ is $-e$, the amplitude for this process can, in fact,
be immediately obtained from (8.12) by merely changing the sign of e. Because of the way
e and the 4-momenta p and p' enter (8.12), however, this in turn is the same as letting
$p \to -p'$ and $p' \to -p$; this changes the sign of the '$e(p + p')_\mu$' part as required, and
leaves the exponential unchanged. Hence we see in action here (admittedly in a very simple
example) the Feynman interpretation of the negative 4-momentum solutions, described in
section 3.4.4: the amplitude for $s^-(p) \to s^-(p')$ is the same as the amplitude for $s^+(-p') \to
s^+(-p)$. The latter process is shown in figure 8.2(b).

The same conclusion can be derived from the field-theory formalism. In this case we
need to evaluate the matrix element

$$\langle s^-, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^-, p\rangle, \tag{8.31}$$

where the same $\hat{j}^\mu_{{\rm em},s}$ of equation (8.23) enters: $\hat{\phi}$ of (7.16) contains the anti-particle operator
too! It is again a good exercise to check, using

$$|s^-, p\rangle = \sqrt{2E}\,\hat{b}^\dagger(p)|0\rangle \tag{8.32}$$

and remembering to normally order the operators in $\hat{j}^\mu_{{\rm em},s}$, that (8.31) is given by the
expected result, namely, (8.27) with $e \to -e$ (problem 8.3).

Since the matrix elements only differ by a sign, the cross sections for $s^+$ and $s^-$ Coulomb
scattering will be the same to this (lowest) order in $\alpha$.


8.2 Coulomb scattering of charged spin-½ particles


8.2.1 Coulomb scattering of e⁻ (wavefunction approach)


We shall call the particle an electron, of charge $-e$ ($e > 0$) and mass m; note that by
convention it is the negatively charged fermion that is the 'particle', but the positively
charged boson. The process we are considering is (figure 8.3)

$$e^-(k, s) \to e^-(k', s') \tag{8.33}$$

where k and s are the 4-momentum and spin of the incident $e^-$, respectively, and similarly
for k', s', with $k = (E, \boldsymbol{k})$ and $E = (m^2 + \boldsymbol{k}^2)^{1/2}$ and similarly for k'.

FIGURE 8.3
Coulomb scattering of e⁻.
The appropriate potential to use in the Dirac equation has been given in section 3.5:

$$\hat{V}_D = -eA^0\boldsymbol{1} + e\boldsymbol{\alpha}\cdot\boldsymbol{A} = -e\begin{pmatrix} A^0 & -\boldsymbol{\sigma}\cdot\boldsymbol{A} \\ -\boldsymbol{\sigma}\cdot\boldsymbol{A} & A^0 \end{pmatrix} \tag{8.34}$$

for a particle of charge $-e$. This potential is a 4 x 4 matrix and to obtain an amplitude in
the form of a single complex number, we must use $\psi^\dagger$ instead of $\psi^*$ in the matrix element.
The first-order amplitude (figure 8.3) is therefore

$$A_{e^-} = -i\int d^4x\, \psi'^\dagger(k', s')\hat{V}_D\psi(k, s) \tag{8.35}$$

where s and s' label the spin components. The spin labels are necessary since the spin
configuration may be changed by the interaction. In (8.35), $\psi$ and $\psi'^\dagger$ are free-particle
positive-energy solutions of the Dirac equation, as in (3.74), with u given by equation (3.73) and
normalized to $u^\dagger u = 2E$, $E = (m^2 + \boldsymbol{k}^2)^{1/2}$.

The Lorentz properties of (8.35) become much clearer if we use the $\gamma$-matrix notation
of problem 4.3. For convenience we re-state the definitions here:

$$\gamma^0 = \beta, \qquad (\gamma^0)^2 = 1 \tag{8.36}$$
$$\gamma^i = \beta\alpha_i, \qquad (\gamma^i)^2 = -1, \qquad i = 1, 2, 3. \tag{8.37}$$

The Dirac equation may then be written (problem 4.3) as

$$(i\slashed{\partial} - m)\psi = 0 \tag{8.38}$$

where the 'slash' notation introduced in (7.59) has been used ($\slashed{\partial} = \gamma^\mu\partial_\mu$). Defining $\bar{\psi} =
\psi^\dagger\gamma^0$, (8.35) becomes

$$A_{e^-} = -i\int d^4x\, (-e\bar{\psi}'(x)\gamma^\mu\psi(x))A_\mu(x) \tag{8.39}$$
$$\equiv -i\int d^4x\, j^\mu_{{\rm em},e^-}(x)A_\mu(x) \tag{8.40}$$

where we have defined an electromagnetic transition current for a negatively charged
fermion:

$$j^\mu_{{\rm em},e^-}(x) = -e\bar{\psi}'(x)\gamma^\mu\psi(x), \tag{8.41}$$

exactly analogous to the one for a positively charged boson introduced in section 8.1.1. We
know from section 4.1.2 that $\bar{\psi}'\gamma^\mu\psi$ is a 4-vector, showing that $A_{e^-}$ of (8.40) is Lorentz
invariant.

Inserting free-particle solutions for $\psi$ and $\psi'^\dagger$ in (8.41), we obtain

$$j^\mu_{{\rm em},e^-}(x) = -e\bar{u}(k', s')\gamma^\mu u(k, s)\, e^{-i(k-k')\cdot x} \tag{8.42}$$

so that (8.39) becomes

$$A_{e^-} = -i\int d^4x\, (-e\bar{u}'\gamma^\mu u\, e^{-i(k-k')\cdot x})A_\mu(x) \tag{8.43}$$

where u = u(k, s) and similarly for u'. Note that the u's do not depend on x. For the case
of the Coulomb potential in equation (8.13), $A_{e^-}$ becomes

$$A_{e^-} = i2\pi\delta(E - E')\frac{Ze^2}{\boldsymbol{q}^2}u'^\dagger u \tag{8.44}$$

just as in (8.15), where $\boldsymbol{q} = \boldsymbol{k} - \boldsymbol{k}'$ and we have used $\bar{u}'\gamma^0 = u'^\dagger$. Comparing (8.44) with
(8.15), we see that (using the covariant normalization N = N' = 1) the amplitude in the
spinor case is obtained from that for the scalar case by the replacement '$2E \to u'^\dagger u$' and
the sign of the amplitude is reversed as expected for $e^-$ rather than $s^+$ scattering.

We now have to understand how to define the cross section for particles with spin and
then how to calculate it. Clearly the cross section is proportional to $|A_{e^-}|^2$, which involves
$|u^\dagger(k', s')u(k, s)|^2$ here. Usually the incident beam is unpolarized, which means that it is a
random mixture of both spin states s ('up' or 'down'). It is important to note that this
is an incoherent average, in the sense that we average the cross section rather than the
amplitude. Furthermore, most experiments usually measure only the direction and energy
of the scattered electron and are not sensitive to the spin state s'. Thus what we wish to
calculate, in this case, is the unpolarized cross section defined by

$$d\bar{\sigma} = \tfrac{1}{2}(d\sigma_{\uparrow\uparrow} + d\sigma_{\uparrow\downarrow} + d\sigma_{\downarrow\uparrow} + d\sigma_{\downarrow\downarrow}) = \frac{1}{2}\sum_{s',s} d\sigma_{s',s} \tag{8.45}$$

where $d\sigma_{s',s} \propto |u^\dagger(k', s')u(k, s)|^2$. In (8.45), we are averaging over the two possible initial
spin polarizations and summing over the final spin states arising from each initial spin state.
It is possible to calculate the quantity

$$S = \frac{1}{2}\sum_{s',s}|u'^\dagger u|^2 \tag{8.46}$$

by brute force, using (3.73) and taking the two-component spinors to be, say,

$$\phi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.47}$$

One finds (problem 8.4)

$$S = (2E)^2(1 - v^2\sin^2\theta/2) \tag{8.48}$$


where $v = |\boldsymbol{k}|/E$ is the particle's speed and $\theta$ is the scattering angle. If we now recall that
(a) the matrix element (8.44) can be obtained from (8.15) by the replacement '$2E \to u'^\dagger u$'
and (b) the normalization of our spinor states is the same ('$\rho = 2E$') as in the scalar case,
so that the flux and density of states factors are unchanged, we may infer from (8.21) that

$$\frac{d\bar{\sigma}}{d\Omega} = \frac{(Z\alpha)^2E^2}{4|\boldsymbol{k}|^4\sin^4\theta/2}(1 - v^2\sin^2\theta/2). \tag{8.49}$$

This is the Mott cross section (Mott 1929). Comparing this with the basic Rutherford
formula (8.21), we see that the factor $(1 - v^2\sin^2\theta/2)$ (which comes from the spin summation)
represents the effect of replacing spin-0 scattering particles by spin-½ ones.
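The 'brute force' evaluation of (8.46) is easy to mimic numerically. The Python sketch below is an illustration under stated assumptions: it takes the spinor (3.73) to have the standard-representation form $u = \sqrt{E+m}\,(\phi,\ \boldsymbol{\sigma}\cdot\boldsymbol{k}\,\phi/(E+m))$, places the incident momentum along the z-axis with the scattering in the x-z plane, and uses function names of our own choosing.

```python
import math

def sigma_dot(k):
    """2x2 matrix sigma.k for k = (kx, ky, kz)."""
    kx, ky, kz = k
    return [[kz, kx - 1j * ky], [kx + 1j * ky, -kz]]

def u_spinor(k, m, phi):
    """Assumed form of (3.73): sqrt(E+m) (phi, sigma.k phi / (E+m)), with u^dagger u = 2E."""
    E = math.sqrt(m * m + sum(c * c for c in k))
    sk = sigma_dot(k)
    lower = [sum(sk[i][j] * phi[j] for j in range(2)) / (E + m) for i in range(2)]
    n = math.sqrt(E + m)
    return [n * c for c in list(phi) + lower]

def spin_sum(kmag, m, theta):
    """S = (1/2) sum over s', s of |u'^dagger u|^2, as in (8.46), for elastic scattering."""
    k = (0.0, 0.0, kmag)
    kp = (kmag * math.sin(theta), 0.0, kmag * math.cos(theta))
    basis = [(1, 0), (0, 1)]                      # the two-component spinors of (8.47)
    total = 0.0
    for phi_in in basis:
        for phi_out in basis:
            u = u_spinor(k, m, phi_in)
            up = u_spinor(kp, m, phi_out)
            amp = sum(a.conjugate() * b for a, b in zip(up, u))   # u'^dagger u
            total += abs(amp) ** 2
    return 0.5 * total

def spin_sum_closed_form(kmag, m, theta):
    """Right-hand side of (8.48): (2E)^2 (1 - v^2 sin^2(theta/2))."""
    E = math.sqrt(m * m + kmag * kmag)
    v = kmag / E
    return (2 * E) ** 2 * (1 - v * v * math.sin(theta / 2) ** 2)
```

Comparing `spin_sum` with `spin_sum_closed_form` for a few momenta and angles is a quick check on an analytic solution to problem 8.4.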

Indeed, this factor has an important physical interpretation. Consider the extreme
relativistic limit ($v \to 1$, $m \to 0$), when the factor becomes $\cos^2\theta/2$, which vanishes in the
backward direction $\theta = \pi$. This may be understood as follows. In the $m \to 0$ limit, it is
appropriate to use the representation (3.40) of the Dirac matrices and, in this case, equations
(4.14) and (4.15) show that the Dirac spinor takes the form

$$u = \begin{pmatrix} u_R \\ u_L \end{pmatrix} \tag{8.50}$$

where $u_R$ and $u_L$ have positive and negative helicity, respectively. The spinor part of the
matrix element (8.44) then becomes $u_R'^\dagger u_R + u_L'^\dagger u_L$, from which it is clear that helicity is
conserved: the helicity of the u' spinors equals that of the u spinors; in particular, there
are no helicity mixing terms of the form $u_R'^\dagger u_L$ or $u_L'^\dagger u_R$. Consider then an initial state
electron with positive helicity, and take the z-axis to be along the incident momentum.
The z-component of angular momentum is then $+\frac{1}{2}$. Suppose the electron is scattered
through an angle of $\pi$. Since helicity is conserved, the scattered electron's helicity will
still be positive, but since the direction of its momentum has been reversed, its angular
momentum along the original axis will be $-\frac{1}{2}$. Hence this configuration is forbidden by
angular momentum conservation, and similarly for an incoming negative helicity state.
The spin labels s', s in (8.46) can be taken to be helicity labels and so it follows that the
quantity S must vanish for $\theta = \pi$ in the $m \to 0$ limit. The 'R' and 'L' states are mixed by
a mass term in the Dirac equation (see (4.14) and (4.15)) and hence we expect backward
scattering to be increasingly allowed as m/E increases (recall that $v = (1 - m^2/E^2)^{1/2}$ so
that $1 - v^2\sin^2\theta/2 = \cos^2\theta/2 + (m^2/E^2)\sin^2\theta/2$).
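The behaviour of the spin factor is simple to tabulate. The short Python sketch below (function names are our own) evaluates the factor in both of the forms just quoted, so that its vanishing at $\theta = \pi$ for $m \to 0$, and its backward value $m^2/E^2$ otherwise, can be seen explicitly.

```python
import math

def mott_factor(E, m, theta):
    """Spin factor 1 - v^2 sin^2(theta/2) multiplying the Rutherford formula in (8.49)."""
    v2 = 1 - (m / E) ** 2                  # v^2 = 1 - m^2/E^2
    return 1 - v2 * math.sin(theta / 2) ** 2

def mott_factor_split(E, m, theta):
    """The same factor written as cos^2(theta/2) + (m^2/E^2) sin^2(theta/2)."""
    return math.cos(theta / 2) ** 2 + (m / E) ** 2 * math.sin(theta / 2) ** 2

def backward_factor(E, m):
    """Value of the spin factor at theta = pi; it equals m^2/E^2."""
    return mott_factor(E, m, math.pi)
```

For example, `backward_factor(2.0, 1.0)` gives 0.25, while for m = 0 the backward factor vanishes, as required by helicity conservation.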


8.2.2 Coulomb scattering of e⁻ (field-theoretic approach)

Once again, the interaction Hamiltonian has been given in section 7.4, namely

$$\hat{\mathcal{H}}'_D = -e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}A_\mu \equiv \hat{j}^\mu_{{\rm em},e}A_\mu \tag{8.51}$$

where the current operator $\hat{j}^\mu_{{\rm em},e}$ is just $-e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}$ in this case. The lowest-order amplitude
is then

$$A_{e^-} = -i\langle e^-, k', s'|\int d^4x\, \hat{\mathcal{H}}'_D(x)|e^-, k, s\rangle \tag{8.52}$$
$$= -i\int d^4x\, \langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle A_\mu(x). \tag{8.53}$$

With our normalization, and referring to the fermionic expansion (7.35), the states are
defined by

$$|e^-, k, s\rangle = \sqrt{2E}\,\hat{c}_s^\dagger(k)|0\rangle \tag{8.54}$$

and similarly for the final state. We then find (problem 8.5) that the current matrix element
in (8.53) takes the form

$$\langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle = -e\bar{u}'\gamma^\mu u\, e^{-i(k-k')\cdot x} = j^\mu_{{\rm em},e^-}(x) \tag{8.55}$$

exactly as in (8.42). Thus once again, the 'wavefunction' and 'field-theoretic' approaches
have been shown to be equivalent, in a simple case.


8.2.3 Trace techniques for spin summations 


The calculation of cross sections involving fermions rapidly becomes laborious following the
'brute force' method of section 8.2.1, in which the explicit forms for u and $u'^\dagger$ were used.
Fortunately we can avoid this by using a powerful labour-saving device due to Feynman, in
which the $\gamma$'s come into their own.

We need to calculate the quantity S given in (8.46). This will turn out to be just the
first in a series of such objects. With later needs in mind, we shall here calculate a more
general quantity than (8.46), namely the lepton tensor

$$L^{\mu\nu}(k', k) = \frac{1}{2}\sum_{s',s}\bar{u}(k', s')\gamma^\mu u(k, s)[\bar{u}(k', s')\gamma^\nu u(k, s)]^* \tag{8.56}$$
$$= \frac{1}{2e^2}\sum_{s',s}\langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(0)|e^-, k, s\rangle\langle e^-, k', s'|\hat{j}^\nu_{{\rm em},e}(0)|e^-, k, s\rangle^*. \tag{8.57}$$

Clearly this will be relevant to the more general case in which $A^\mu$ contains non-zero spatial
components, for example. For our present application, we shall need only $L^{00}$.

We first note that $L^{\mu\nu}$ is correctly called a tensor (a contravariant second-rank one,
in fact; see appendix D) because the two factors '$\bar{u}'\gamma^\mu u$' and '$\bar{u}'\gamma^\nu u$' are each 4-vectors, as we
have seen. (We might worry a little over the complex conjugation of the second factor, but
this will disappear after the next step.) Consider therefore the factor $[\bar{u}(k', s')\gamma^\nu u(k, s)]^*$.
For each value of the index $\nu$, this is just a number (the corresponding component of the
4-vector), and so it can make no difference if we take its transpose, in a matrix sense (the
transpose of a 1 x 1 matrix is certainly equal to itself!). In that case the complex conjugate
becomes the Hermitian conjugate, which is:

$$[\bar{u}(k', s')\gamma^\nu u(k, s)]^\dagger = u^\dagger(k, s)\gamma^{\nu\dagger}\gamma^{0\dagger}u(k', s') \tag{8.58}$$
$$= \bar{u}(k, s)\gamma^\nu u(k', s') \tag{8.59}$$

since (problem 8.6)

$$\gamma^0\gamma^{\nu\dagger}\gamma^0 = \gamma^\nu \tag{8.60}$$

and $\gamma^{0\dagger} = \gamma^0$. Thus $L^{\mu\nu}$ may be written in the more streamlined form

$$L^{\mu\nu} = \frac{1}{2}\sum_{s',s}\bar{u}(k', s')\gamma^\mu u(k, s)\bar{u}(k, s)\gamma^\nu u(k', s') \tag{8.61}$$


which is, moreover, evidently the (tensor) product of two 4-vectors. However, there is more
to this than saving a few symbols. We have seen the expression

$$\sum_s u(k, s)\bar{u}(k, s) \tag{8.62}$$

before! (See (7.64) and problem 7.8.) Thus we can replace the sum (8.62) over spin states
's' by the corresponding matrix $(\slashed{k} + m)$:

$$L^{\mu\nu} = \frac{1}{2}\sum_{s'}\bar{u}_\alpha(k', s')(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta}u_\delta(k', s') \tag{8.63}$$

where we have made the matrix indices explicit, and summation on all repeated matrix
indices is understood. In particular, note that every matrix index is repeated, so that each
one is in fact summed over; there are no 'spare' indices. Now, since we can reorder matrix
elements as we wish, we can bring the $u_\delta$ to the front of the expression, and use the same
trick to perform the second spin sum:

$$\sum_{s'}u_\delta(k', s')\bar{u}_\alpha(k', s') = (\slashed{k}' + m)_{\delta\alpha}. \tag{8.64}$$

Thus $L^{\mu\nu}$ takes the form of a matrix product, summed over the diagonal elements:

$$L^{\mu\nu} = \frac{1}{2}(\slashed{k}' + m)_{\delta\alpha}(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta} \tag{8.65}$$
$$= \frac{1}{2}\sum_\delta[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]_{\delta\delta} \tag{8.66}$$

where we have explicitly reinstated the sum over $\delta$. The right-hand side of (8.66) is the
trace (i.e. the sum of the diagonal elements) of the matrix formed by the product of the
four indicated matrices:

$$L^{\mu\nu} = \frac{1}{2}{\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]. \tag{8.67}$$

Such matrix traces have some useful properties which we now list. Denote the trace of
a matrix A by

$${\rm Tr}A = \sum_i A_{ii}. \tag{8.68}$$

Consider now the trace of a matrix product,

$${\rm Tr}(AB) = \sum_{i,j}A_{ij}B_{ji} \tag{8.69}$$

where we have written the summations in explicitly. We can (as before) freely exchange the
order of the matrix elements $A_{ij}$ and $B_{ji}$, to rewrite (8.69) as

$${\rm Tr}(AB) = \sum_{i,j}B_{ji}A_{ij}. \tag{8.70}$$


But the right-hand side is precisely Tr(BA); hence we have shown that 
Tr(AB) = Tr(BA). (8.71) 
Similarly it is easy to show that 
Tr(ABC) = Tr(CAB). (8.72) 


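We may now return to (8.67). The advantage of the trace form is that we can invoke some powerful results about traces of products of $\gamma$-matrices. Here we shall just list the trace 'theorems' that we shall use to evaluate $L^{\mu\nu}$: more complete statements of trace theorems and $\gamma$-matrix algebra, together with proofs of these theorems, are given in appendix J.

We need the following results:

(a) $\mathrm{Tr}\,\mathbf{1} = 4$ (8.73)
(b) $\mathrm{Tr}\,(\text{odd number of } \gamma\text{'s}) = 0$ (8.74)
(c) $\mathrm{Tr}(\slashed{a}\slashed{b}) = 4(a\cdot b)$ (8.75)
(d) $\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a\cdot b)(c\cdot d) + (a\cdot d)(b\cdot c) - (a\cdot c)(b\cdot d)]$. (8.76)

Then

$$\mathrm{Tr}\left[(\slashed{k}'+m)\gamma^\mu(\slashed{k}+m)\gamma^\nu\right] = \mathrm{Tr}(\slashed{k}'\gamma^\mu\slashed{k}\gamma^\nu) + m\,\mathrm{Tr}(\gamma^\mu\slashed{k}\gamma^\nu) + m\,\mathrm{Tr}(\slashed{k}'\gamma^\mu\gamma^\nu) + m^2\,\mathrm{Tr}(\gamma^\mu\gamma^\nu). \tag{8.77}$$

The terms linear in $m$ are zero by theorem (b), and using (c) in the form

$$\mathrm{Tr}(\gamma_\mu\gamma_\nu)a^\mu b^\nu = 4g_{\mu\nu}a^\mu b^\nu = 4a\cdot b \tag{8.78}$$

and (d) in a similar form, we obtain (problem 8.7)

$$L^{\mu\nu} = \frac{1}{2}\mathrm{Tr}\left[(\slashed{k}'+m)\gamma^\mu(\slashed{k}+m)\gamma^\nu\right] = 2\left[k'^\mu k^\nu + k'^\nu k^\mu - (k'\cdot k)g^{\mu\nu}\right] + 2m^2g^{\mu\nu}. \tag{8.79}$$

In the present case we simply want $L^{00}$, which is found to be (problem 7.9)

$$L^{00} = 4E^2(1 - v^2\sin^2\theta/2) \tag{8.80}$$

where $v = |\boldsymbol{k}|/E$, just as in (8.48).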
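The trace result (8.79) and the special case (8.80) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Dirac representation of the $\gamma$-matrices and the metric diag(1, -1, -1, -1), and the function names and sample kinematic values are illustrative choices only.

```python
import math

# Dirac-representation gamma matrices gamma^mu (an assumption of this sketch;
# traces are representation independent). Metric g = diag(1, -1, -1, -1).
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
METRIC = [1.0, -1.0, -1.0, -1.0]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(4)) for j in range(4)]
            for i in range(4)]


def slash_plus_mass(p, m):
    """Matrix p-slash + m for a contravariant 4-vector p = (p0, p1, p2, p3)."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             + (m if i == j else 0.0) for j in range(4)] for i in range(4)]


def dot(a, b):
    """Minkowski scalar product a.b."""
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def lepton_tensor_trace(k, kp, m, mu, nu):
    """(1/2) Tr[(k'-slash + m) gamma^mu (k-slash + m) gamma^nu], as in (8.67)."""
    left = matmul(slash_plus_mass(kp, m), GAMMA[mu])
    right = matmul(slash_plus_mass(k, m), GAMMA[nu])
    prod = matmul(left, right)
    return 0.5 * sum(prod[i][i] for i in range(4))


def lepton_tensor_closed(k, kp, m, mu, nu):
    """Closed form (8.79); g^{mu nu} is diagonal with entries METRIC."""
    g = METRIC[mu] if mu == nu else 0.0
    return 2.0 * (kp[mu] * k[nu] + kp[nu] * k[mu] - dot(kp, k) * g) + 2.0 * m * m * g


# Elastic Coulomb kinematics, |k| = |k'|, with illustrative sample values.
m, kmod, theta = 0.511, 1.3, 0.7
E = math.sqrt(m * m + kmod * kmod)
k = (E, 0.0, 0.0, kmod)
kp = (E, kmod * math.sin(theta), 0.0, kmod * math.cos(theta))

# Largest deviation between the explicit trace and the closed form over all mu, nu.
max_dev = max(abs(lepton_tensor_trace(k, kp, m, mu, nu)
                  - lepton_tensor_closed(k, kp, m, mu, nu))
              for mu in range(4) for nu in range(4))

v = kmod / E
l00_formula = 4.0 * E * E * (1.0 - v * v * math.sin(theta / 2.0) ** 2)  # (8.80)
l00_trace = lepton_tensor_trace(k, kp, m, 0, 0)
print(max_dev, abs(l00_trace - l00_formula))
```

Both printed numbers should be at the level of floating-point rounding, for any choice of the sample momentum and angle.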


8.2.4 Coulomb scattering of $e^+$
The physical process is

$$e^+(k,s) \to e^+(k',s') \tag{8.81}$$

where, as usual, we emphasize that $E$ and $E'$ are both positive. In the wavefunction approach, we saw in section 3.4.4 that, because $\rho \geq 0$ always for a Dirac particle, we had to introduce a minus sign 'by hand', according to the rule stated at the end of section 3.4.4. This rule gives us, in the present case,

$$\text{amplitude}\,(e^+(k,s) \to e^+(k',s')) = -\text{amplitude}\,(e^-(-k',-s') \to e^-(-k,-s)). \tag{8.82}$$

Referring to (8.43), therefore, the required amplitude for the process (8.81) is

$$\mathcal{A}_{e^+} = -i\int d^4x\,\left(e\,\bar{v}(k,s)\gamma^\mu v(k',s')\,e^{-i(k-k')\cdot x}\right)A_\mu(x) \tag{8.83}$$

since the '$v$' solutions have been set up precisely to correspond to the '$-k, -s$' situation. In evaluating the cross section from (8.83), the only difference from the $e^-$ case is the appearance of the spinors '$v$' rather than '$u$'; the lepton tensor in this case is

$$L^{\mu\nu} = \frac{1}{2}\mathrm{Tr}\left[(\slashed{k}-m)\gamma^\mu(\slashed{k}'-m)\gamma^\nu\right] \tag{8.84}$$

using the result (7.64) for $\sum_s v(k,s)\bar{v}(k,s)$. Expression (8.84) differs from (8.67) by the sign of $m$ and by $k \leftrightarrow k'$, but the result (8.79) for the trace is insensitive to these changes. Thus the positron Coulomb scattering cross section is equal to the electron one to lowest order in $\alpha$.

In the field-theoretic approach, the same interaction Hamiltonian $\hat{H}'_{\rm D}$ which we used for $e^-$ scattering will again automatically yield the $e^+$ matrix element (recall the discussion at the end of section 8.1.3). In place of (8.53), the amplitude we wish to calculate is

$$\mathcal{A}_{e^+} = -i\int d^4x\,\langle e^+,k',s'|\hat{j}^\mu_{{\rm em},e}(x)|e^+,k,s\rangle A_\mu(x) = -i\int d^4x\,\langle e^+,k',s'|-e\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)|e^+,k,s\rangle A_\mu(x) \tag{8.85}$$

where, referring to the fermionic expansion (7.35),

$$|e^+,k,s\rangle = \sqrt{2E}\,\hat{d}_s^\dagger(k)|0\rangle, \tag{8.86}$$

and similarly for the final state. In evaluating the matrix element in (8.85) we must again remember to normally order the fields, according to the discussion in section 7.2. Bearing this in mind, and inserting the expansion (7.35), one finds (problem 8.9)

$$\langle e^+,k',s'|\hat{j}^\mu_{{\rm em},e}(x)|e^+,k,s\rangle = +e\,\bar{v}(k,s)\gamma^\mu v(k',s')\,e^{-i(k-k')\cdot x} \tag{8.87}$$
$$\equiv j^\mu_{e^+}(x) \tag{8.88}$$


just as required in (8.83). Note especially that the correct sign has emerged naturally without 
having to be put in ‘by hand’, as was necessary in the wavefunction approach when applied 
to an anti-fermion. 
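The statement in section 8.2.4 that the trace in (8.84) is insensitive to $m \to -m$ and $k \leftrightarrow k'$ can be made explicit with the trace theorems (8.73)-(8.76). The following is a sketch of that step, using only results already quoted in the text:

```latex
\begin{align*}
\tfrac{1}{2}\mathrm{Tr}\big[(\slashed{k}-m)\gamma^\mu(\slashed{k}'-m)\gamma^\nu\big]
 &= \tfrac{1}{2}\mathrm{Tr}(\slashed{k}\gamma^\mu\slashed{k}'\gamma^\nu)
  + \tfrac{1}{2}m^2\,\mathrm{Tr}(\gamma^\mu\gamma^\nu)
  && \text{(terms linear in $m$ vanish by (8.74))} \\
 &= 2\big[k^\mu k'^\nu + k^\nu k'^\mu - (k\cdot k')\,g^{\mu\nu}\big] + 2m^2 g^{\mu\nu} .
\end{align*}
```

The mass enters only through $m^2$, and the bracket is symmetric under $k \leftrightarrow k'$, so this coincides with (8.79).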

We are now ready to look at some more realistic (and covariant) processes. 


8.3 $e^-s^+$ scattering
8.3.1 The amplitude for $e^-s^+ \to e^-s^+$


We consider the two-body scattering process

$$e^-(k,s) + s^+(p) \to e^-(k',s') + s^+(p') \tag{8.89}$$


where the 4-momenta and spins are as indicated in figure 8.4. How will the $e^-$ and $s^+$ interact? In this case, there is no 'external' classical electromagnetic potential in the problem. Instead, each of $e^-$ and $s^+$, as charged particles, acts as source for the electromagnetic field, with which they in turn interact. We can picture the process as one in which each particle scatters off the 'virtual' field produced by the other (we shall make this more precise in comment (2) after equation (8.102)). The formalism of quantum field theory is perfectly adapted to account for such effects, as we shall see. It is very significant that no new interaction is needed to describe the process (8.89) beyond what we already have: the complete Lagrangian is now simply the free-field Lagrangians for the spin-$\frac{1}{2}$ $e^-$, the spin-0 $s^+$ and the Maxwell field, together with the sum of the lowest order scalar electromagnetic interaction Hamiltonian of (8.22), and the Dirac interaction Hamiltonian of (7.135) with $q = -e$. The full interaction Hamiltonian is then

$$\hat{H}'(x) = \left\{ie\left[\hat{\phi}^\dagger(x)\partial^\mu\hat{\phi}(x) - (\partial^\mu\hat{\phi}^\dagger(x))\hat{\phi}(x)\right] - e\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)\right\}\hat{A}_\mu(x) \tag{8.90}$$
$$\equiv \left(\hat{j}^\mu_{{\rm em},s}(x) + \hat{j}^\mu_{{\rm em},e}(x)\right)\hat{A}_\mu(x) \tag{8.91}$$

where the 'total current' in (8.91) is just the indicated sum of the $\hat{\phi}$ (scalar) and $\hat{\psi}$ (spinor) currents. This $\hat{H}'$ must now be used in the Dyson expansion (6.42), in a perturbative calculation of the $e^-s^+ \to e^-s^+$ amplitude.

FIGURE 8.4
$e^-s^+$ scattering amplitude.

Note now that, in contrast to our Coulomb scattering 'warm-ups', the electromagnetic field is quantized in (8.90). We first observe that, since there are no free photons in either the initial or final states in our process $e^-s^+ \to e^-s^+$, the first-order matrix element of $\hat{H}'$ must vanish (as did the corresponding first-order amplitude in AB $\to$ AB scattering, in section 6.3.2). The first non-vanishing scattering processes arise at second order (cf (6.74)):

$$\mathcal{A}_{e^-s^+} = \frac{(-i)^2}{2}\iint d^4x_1\,d^4x_2\,\langle 0|\hat{c}_{s'}(k')\hat{a}(p')\,T\{\hat{H}'(x_1)\hat{H}'(x_2)\}\,\hat{a}^\dagger(p)\hat{c}_s^\dagger(k)|0\rangle \times (16E_kE_pE_{k'}E_{p'})^{1/2}. \tag{8.92}$$


Just as for AB $\to$ AB and the C field in the 'ABC' model (cf (6.81)), as far as the $\hat{A}_\mu$ operators in (8.92) are concerned the only surviving contraction is

$$\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \tag{8.93}$$

which is the Feynman propagator for the photon, in coordinate space. As regards the rest of the matrix element (8.92), since the $\hat{a}$'s and $\hat{c}$'s commute the '$s^+$' and '$e^-$' parts are quite independent, and (8.92) reduces to

$$\frac{(-i)^2}{2}\iint d^4x_1\,d^4x_2\,\Big\{\langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(x_1)|s^+,p\rangle\,\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \times \langle e^-,k',s'|\hat{j}^\nu_{{\rm em},e}(x_2)|e^-,k,s\rangle + (x_1 \leftrightarrow x_2)\Big\}. \tag{8.94}$$


But we know the explicit form of the current matrix elements in (8.94), from (8.27) and (8.55). Inserting these expressions into (8.94) and noting that the term with $x_1 \leftrightarrow x_2$ is identical to the first term, one finds (cf (6.102) and problem 8.10)

$$\mathcal{A}_{e^-s^+} = i(2\pi)^4\delta^4(p+k-p'-k')\,\mathcal{M}_{e^-s^+} \tag{8.95}$$

where (using the general form (7.122) of the photon propagator)

$$i\mathcal{M}_{e^-s^+} = (-i)^2\,\big(e(p+p')^\mu\big)\left(\frac{i\left[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2\right]}{q^2}\right)\big(-e\bar{u}(k',s')\gamma^\nu u(k,s)\big) \tag{8.96}$$
$$\equiv (-i)^2\,j^\mu_{s^+}(p,p')\left(\frac{i\left[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2\right]}{q^2}\right)j^\nu_{e^-}(k,k') \tag{8.97}$$

and $q = (k-k') = (p'-p)$. We have introduced here the 'momentum-space' currents

$$j^\mu_{s^+}(p,p') = e(p+p')^\mu \tag{8.98}$$

and

$$j^\nu_{e^-}(k,k') = -e\bar{u}(k',s')\gamma^\nu u(k,s) \tag{8.99}$$

shortening the notation by dropping the 'em' suffix, which is understood.
Before proceeding to calculate the cross section, some comments on (8.97) are in order: 


Comment (1) 


The $j^\mu_{s^+}(p,p')$ and $j^\nu_{e^-}(k,k')$ in (8.98) and (8.99) are the momentum-space versions of the $x$-dependent current matrix elements in (8.27) and (8.55); they are, in fact, simply those matrix elements evaluated at $x = 0$. The $x$-dependent matrix elements (8.27) and (8.55) both satisfy the current conservation equations $\partial_\mu j^\mu(x) = 0$ as is easy to check (problem 8.11). Correspondingly, it follows from (8.98) and (8.99) that we have

$$q_\mu j^\mu_{s^+}(p,p') = q_\mu j^\mu_{e^-}(k,k') = 0 \tag{8.100}$$

where $q = p'-p = k-k'$, and we have used the mass-shell conditions $p^2 = p'^2 = M^2$, $\slashed{k}u = mu$, $\bar{u}'\slashed{k}' = m\bar{u}'$; the relations (8.100) are the momentum-space versions of current conservation. The $\xi$-dependent part of the photon propagator, which is proportional to $q_\mu q_\nu$, therefore vanishes in the matrix element (8.97). This shows that the amplitude is independent of the gauge parameter $\xi$; in other words, it is gauge invariant and proportional simply to

$$j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \tag{8.101}$$
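The relations (8.100) follow in one line each from the mass-shell conditions just quoted. A sketch of the check, writing $\bar{u}' \equiv \bar{u}(k',s')$ and $u \equiv u(k,s)$ as shorthand:

```latex
\begin{align*}
q_\mu\, j^\mu_{s^+}(p,p') &= e\,(p'-p)\cdot(p'+p) = e\,(p'^2 - p^2) = e\,(M^2 - M^2) = 0, \\
q_\mu\, j^\mu_{e^-}(k,k') &= -e\,\bar{u}'\,(\slashed{k}-\slashed{k}')\,u
   = -e\,(m - m)\,\bar{u}'u = 0 .
\end{align*}
```

In the second line $\slashed{k}$ acts to the right on $u$ and $\slashed{k}'$ acts to the left on $\bar{u}'$, each giving a factor $m$.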


Comment (2) 


The amplitude (8.97) has the appealing form of two currents 'hooked together' by the photon propagator. In the form (8.101), it has a simple 'semi-classical' interpretation. Suppose we regard the process $e^-s^+ \to e^-s^+$ as the scattering of the $e^-$, say, in the field produced by the $s^+$ (we can see from (8.101) that the answer is going to be symmetrical with respect to whichever of $e^-$ and $s^+$ is singled out in this way). Then the amplitude will be, as in (8.43),

$$\mathcal{A}_{e^-s^+} = -i\int d^4x\,j^\nu_{e^-}(k,k')\,e^{-i(k-k')\cdot x}A_\nu(x) \tag{8.102}$$

where now the classical field $A_\nu(x)$ is not an 'external' Coulomb field but the field caused by the motion of the $s^+$. It seems very plausible that this $A_\nu(x)$ should be given by the solution of the Maxwell equations (2.22), with the $j^\mu_{\rm em}(x)$ on the right-hand side given by the transition current (8.11) (with $N = N' = 1$) appropriate to the motion $s^+(p) \to s^+(p')$:

$$\Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = j^\mu_{s^+}(x) \tag{8.103}$$

where

$$j^\mu_{s^+}(x) = e(p+p')^\mu e^{-i(p-p')\cdot x}. \tag{8.104}$$


Equation (8.103) will be much easier to solve if we can decouple the components of $A^\mu$ by using the Lorentz condition $\partial_\mu A^\mu = 0$. We are aware of the problems with this condition in the field-theory case (cf section 7.3.2) but we are here treating $A^\mu$ classically. Although $A^\mu$ is not a free field in (8.103), it is easy to see that we may consistently take $\partial_\mu A^\mu = 0$ provided that the current is conserved, $\partial_\mu j^\mu_{s^+}(x) = 0$, which we know to be the case. Thus we have to solve

$$\Box A^\mu(x) = e(p+p')^\mu e^{-i(p-p')\cdot x}. \tag{8.105}$$

Noting that

$$\Box\,e^{-i(p-p')\cdot x} = -(p-p')^2\,e^{-i(p-p')\cdot x} \tag{8.106}$$

we obtain, by inspection,

$$A^\mu(x) = -\frac{1}{q^2}\,e(p+p')^\mu e^{-i(p-p')\cdot x} \tag{8.107}$$

where $q = p'-p$. Inserting this expression into the amplitude (8.102) we find

$$\mathcal{A}_{e^-s^+} = i(2\pi)^4\delta^4(p+k-p'-k')\,\mathcal{M}_{e^-s^+} \tag{8.108}$$

where

$$i\mathcal{M}_{e^-s^+} = j^\mu_{s^+}(p,p')\,\frac{ig_{\mu\nu}}{q^2}\,j^\nu_{e^-}(k,k') \tag{8.109}$$

exactly as in (8.97) for $\xi = 1$ (the gauge appropriate to '$\partial_\mu A^\mu = 0$').


FIGURE 8.5
Feynman diagram for $e^-s^+$ scattering in the one-photon exchange approximation.


Comment (3)


From the work of chapter 6, it is clear that we can give a Feynman graph interpretation of the amplitude (8.109) as shown in figure 8.5, and set out the corresponding Feynman rules:

(i) At a vertex where a photon is emitted or absorbed by an $s^+$ particle, the factor is $-ie(p+p')^\mu$ where $p$ and $p'$ are the incident and outgoing 4-momenta of the $s^+$, respectively; the vertex for $s^-$ has the opposite sign.

(ii) At a vertex where a photon is emitted or absorbed by an $e^-$, the factor is $ie\gamma^\mu$ ($e > 0$); for an $e^+$ it is $-ie\gamma^\mu$. (This and the previous rule arise from associating one '$(-i)$' factor in (8.94) or (8.97) with each current.)

(iii) For each initial state fermion line a factor $u(k,s)$ and for each final state fermion line a factor $\bar{u}(k',s')$; for each initial state anti-fermion line a factor $\bar{v}(k,s)$ and for each final state anti-fermion line a factor $v(k',s')$ (these rules reconstruct the $e^+$ Coulomb amplitudes of section 8.2.4).

(iv) For an internal photon of 4-momentum $q$, there is a factor $-ig_{\mu\nu}/q^2$ in the gauge $\xi = 1$.

(v) Multiplying these factors together gives the quantity $i\mathcal{M}$; multiplying the result by an overall 4-momentum-conserving $\delta$-function factor $(2\pi)^4\delta^4(p'+k'+\cdots-p-k-\cdots)$ gives the quantity $\mathcal{A}$.


Comment (4) 


We know that our amplitude is proportional to

$$j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \tag{8.110}$$

Choosing the coordinate system such that $q^\mu = (q^0, 0, 0, |\boldsymbol{q}|)$, the current conservation equations $q\cdot j_{s^+} = q\cdot j_{e^-} = 0$ read:

$$j^3 = q^0 j^0/|\boldsymbol{q}| \tag{8.111}$$

for both currents. Expression (8.101) can then be written, up to an overall sign which is immaterial here, as

$$\left(j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-}\right)/q^2 + \left(j^3_{s^+}j^3_{e^-} - j^0_{s^+}j^0_{e^-}\right)/q^2$$
$$= \left(j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-}\right)/q^2 + j^0_{s^+}j^0_{e^-}/\boldsymbol{q}^2 \tag{8.112}$$

using (8.111). The first term may be interpreted as being due to the exchange of a transversely polarized photon (only the 1,2 components enter, perpendicular to $\boldsymbol{q}$). For real photons $q^2 \to 0$, so that this term will completely dominate the second. The latter, however, must obviously be included when $q^2 \neq 0$, as of course is the case for this virtual $\gamma$ (cf section 6.3.3). We note that the second term depends on the 3-momentum squared, $\boldsymbol{q}^2$, rather than the 4-momentum squared $q^2$, and that it involves the charge densities $j^0_{s^+}$ and $j^0_{e^-}$. Referring back to section 7.1, we can interpret it as the instantaneous Coulomb interaction between these charge densities, since

$$\int d^4x\,e^{iq\cdot x}\,\delta(t)/r = \int d^3x\,e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}/r = 4\pi/\boldsymbol{q}^2. \tag{8.113}$$

Thus, in summary, the single covariant amplitude (8.109) includes contributions from the exchange of transversely polarized photons and from the familiar Coulomb potential. This is the true relativistic extension of the static Coulomb results of (8.15) and (8.44).
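The split of the covariant product $j_{s^+}\cdot j_{e^-}/q^2$ into a transverse piece and an instantaneous Coulomb piece can be checked with a few lines of arithmetic. The sketch below is illustrative only: the current components are arbitrary sample numbers (real, for simplicity), and the helper names are not from the text. It fixes $j^3$ from current conservation as in (8.111) and compares the two sides.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Minkowski product a.b with metric diag(1, -1, -1, -1)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def conserved_current(j0, j1, j2, q):
    """Build (j0, j1, j2, j3) with j3 fixed by q.j = 0 for q = (q0, 0, 0, |q|)."""
    return (j0, j1, j2, q[0] * j0 / q[3])


q = (0.4, 0.0, 0.0, 1.5)      # sample space-like momentum transfer (q0, 0, 0, |q|)
q2 = dot(q, q)                # 4-momentum squared (negative here)
qvec2 = q[3] ** 2             # 3-momentum squared

j_s = conserved_current(2.0, 0.3, -0.7, q)    # stand-in for the s+ current
j_e = conserved_current(-1.1, 0.5, 0.2, q)    # stand-in for the e- current

covariant = dot(j_s, j_e) / q2                          # j_s . j_e / q^2
transverse = (j_s[1] * j_e[1] + j_s[2] * j_e[2]) / q2   # components 1, 2 only
coulomb = j_s[0] * j_e[0] / qvec2                       # charge densities / |q|^2

# The covariant form equals minus the sum of the two pieces.
print(covariant, -(transverse + coulomb))
```

The two printed numbers agree; the relative minus sign is the overall sign that drops out when the amplitude is only quoted up to proportionality.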


8.3.2 The cross section for $e^-s^+ \to e^-s^+$


The invariant amplitude $\mathcal{M}_{e^-s^+}(s,s')$ for our process is given by (8.109) as

$$\mathcal{M}_{e^-s^+}(s,s') = e\bar{u}(k',s')\gamma^\mu u(k,s)\left(-g_{\mu\nu}/q^2\right)e(p+p')^\nu \tag{8.114}$$

where we have now included the spin dependence of the amplitude $\mathcal{M}_{e^-s^+}$ in the notation. The steps to the cross sections are now exactly as for the spin-0 case (section 6.3.4), as modified by the spin summing and averaging already met in sections 8.2.1 and 8.2.3, particularly the latter. The cross section for the scattering of an electron in spin state $s$ to one in spin state $s'$ is (cf (6.110))

$$d\sigma_{ss'} = \frac{1}{4E\omega|\boldsymbol{v}|}\,|\mathcal{M}_{e^-s^+}(s,s')|^2\,(2\pi)^4\delta^4(k'+p'-k-p)\times\frac{1}{(2\pi)^6}\frac{d^3k'}{2\omega'}\frac{d^3p'}{2E'} \tag{8.115}$$

where we have defined

$$k^\mu = (\omega,\boldsymbol{k}) \qquad k'^\mu = (\omega',\boldsymbol{k}') \qquad p^\mu = (E,\boldsymbol{p}) \qquad p'^\mu = (E',\boldsymbol{p}'). \tag{8.116}$$


For the unpolarized cross section we are required, as in (8.46), to evaluate the quantity

$$\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2 = \left(\frac{e^2}{q^2}\right)^2\frac{1}{2}\sum_{s,s'}\bar{u}(k',s')\gamma^\mu u(k,s)\,\bar{u}(k,s)\gamma^\nu u(k',s') \times (p+p')_\mu(p+p')_\nu \tag{8.117}$$
$$= \left(\frac{e^2}{q^2}\right)^2 L^{\mu\nu}T_{\mu\nu} \tag{8.118}$$

where the boson tensor $T_{\mu\nu}$ is just $(p+p')_\mu(p+p')_\nu$, and the lepton tensor $L^{\mu\nu}$ has been evaluated in (8.79). Using $q^2 = (k-k')^2 = 2m^2 - 2k\cdot k'$, the expression (8.79) can be rewritten as

$$L^{\mu\nu}(k,k') = 2\left[k'^\mu k^\nu + k'^\nu k^\mu + (q^2/2)g^{\mu\nu}\right]. \tag{8.119}$$

We then find (problem 8.12)

$$L^{\mu\nu}T_{\mu\nu} = 8\left[2(p\cdot k)(p\cdot k') + (q^2/2)M^2\right] \tag{8.120}$$

since $k'\cdot p' = k\cdot p$ and $k\cdot p' = k'\cdot p$ from 4-momentum conservation, and $p^2 = p'^2 = M^2$ (we are using $m$ for the $e^-$ mass and $M$ for the $s^+$ mass).
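The contraction (8.120) can be confirmed numerically by summing $L^{\mu\nu}T_{\mu\nu}$ index by index for one set of on-shell, momentum-conserving 4-vectors. The sketch below uses CM-frame elastic kinematics with arbitrary sample masses and momenta; the function names are illustrative and not from the text.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Minkowski product with metric diag(1, -1, -1, -1)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def lepton_tensor(k, kp, mu, nu):
    """L^{mu nu} of (8.119), with upper indices."""
    q = [a - b for a, b in zip(k, kp)]
    g = METRIC[mu] if mu == nu else 0.0
    return 2.0 * (kp[mu] * k[nu] + kp[nu] * k[mu] + 0.5 * dot(q, q) * g)


def contraction(k, kp, p, pp):
    """L^{mu nu} T_{mu nu}, with T_{mu nu} = (p+p')_mu (p+p')_nu, summed explicitly."""
    total_lower = [METRIC[mu] * (p[mu] + pp[mu]) for mu in range(4)]
    return sum(lepton_tensor(k, kp, mu, nu) * total_lower[mu] * total_lower[nu]
               for mu in range(4) for nu in range(4))


# CM-frame elastic scattering: sample electron mass m, target mass M,
# common 3-momentum magnitude pmod and scattering angle theta.
m, M, pmod, theta = 0.5, 2.0, 1.5, 1.1
w = math.sqrt(m * m + pmod * pmod)
E = math.sqrt(M * M + pmod * pmod)
sx, cz = pmod * math.sin(theta), pmod * math.cos(theta)
k, p = (w, 0.0, 0.0, pmod), (E, 0.0, 0.0, -pmod)
kp, pp = (w, sx, 0.0, cz), (E, -sx, 0.0, -cz)

q = [a - b for a, b in zip(k, kp)]
lhs = contraction(k, kp, p, pp)
rhs = 8.0 * (2.0 * dot(p, k) * dot(p, kp) + 0.5 * dot(q, q) * M * M)  # (8.120)
print(lhs, rhs)
```

The agreement relies on 4-momentum conservation and the mass-shell conditions, so it fails (as it should) if the sample momenta are changed inconsistently.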

We can now give the differential cross section in the CM frame by taking over the formula (6.129) with

$$|\mathcal{M}|^2 \to \frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2$$

so as to obtain

$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm CM} = \frac{2\alpha^2}{W^2(q^2)^2}\left[2(p\cdot k)(p\cdot k') + (q^2/2)M^2\right] \tag{8.121}$$

where $\alpha = e^2/4\pi$ and $W^2 = (k+p)^2$.

A somewhat more physically meaningful formula is found if we ask for the cross section in the 'laboratory' frame which we define by the condition $p^\mu = (M, \boldsymbol{0})$. The evaluation of the phase space integral requires some care and this is detailed in appendix K. The result is

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\cos^2(\theta/2)\,\frac{k'}{k}. \tag{8.122}$$

In this formula we have neglected the electron mass in the kinematics so that

$$k \equiv |\boldsymbol{k}| = \omega \tag{8.123}$$
$$k' \equiv |\boldsymbol{k}'| = \omega' \tag{8.124}$$

and

$$q^2 = -4kk'\sin^2(\theta/2) \tag{8.125}$$

where $\theta$ is the electron scattering angle in this frame, as shown in figure 8.6, and

$$(k/k') = 1 + (2k/M)\sin^2(\theta/2) \tag{8.126}$$

from equation (K.20). Note that there is a slight abuse of notation here; in the context of results for such laboratory frame calculations, '$k$' and '$k'$' are not 4-vectors, but rather the moduli of 3-vectors, as defined in equations (8.123) and (8.124).

FIGURE 8.6
Two-body scattering in the 'laboratory' frame.

We shall denote the cross section (8.122) by

$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} \equiv \text{'no-structure' cross section}. \tag{8.127}$$

It describes essentially the 'kinematics' of a relativistic electron scattering from a pointlike spin-0 target which recoils. Comparing the result (8.122) with equation (8.49), and remembering that here $Z = 1$ and we are taking $v \to 1$ for the electron, we see that the effect of recoil is contained in the factor $(k'/k)$, in this limit. We recover the 'no-recoil' result (8.49) in the limit $M \to \infty$, as expected. In particular, referring to (8.125), we understand Rutherford's '$\sin^{-4}\theta/2$' factor in terms of the exchange of a massless quantum via the propagator factor $(1/q^2)^2$.
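The laboratory-frame formulae (8.122) and (8.126) are simple enough to evaluate directly. The sketch below is an illustration under stated assumptions: natural units, the electron mass neglected as in the text, an approximate numerical value for $\alpha$, and function names chosen here for convenience. It also checks that the recoil relation (8.126) puts the struck particle back on its mass shell, and that the recoil factor $k'/k$ tends to 1 for a very heavy target.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value)


def scattered_momentum(k, theta, target_mass):
    """k' from (8.126): k/k' = 1 + (2k/M) sin^2(theta/2)."""
    return k / (1.0 + (2.0 * k / target_mass) * math.sin(theta / 2.0) ** 2)


def no_structure_cross_section(k, theta, target_mass):
    """Laboratory-frame d(sigma)/d(Omega) of (8.122), in natural units."""
    kp = scattered_momentum(k, theta, target_mass)
    s2 = math.sin(theta / 2.0) ** 2
    c2 = math.cos(theta / 2.0) ** 2
    return ALPHA ** 2 * c2 / (4.0 * k * k * s2 * s2) * (kp / k)


def no_recoil_cross_section(k, theta):
    """The same expression with the recoil factor k'/k set to 1 (M -> infinity)."""
    s2 = math.sin(theta / 2.0) ** 2
    c2 = math.cos(theta / 2.0) ** 2
    return ALPHA ** 2 * c2 / (4.0 * k * k * s2 * s2)


def recoil_mass_squared(k, theta, target_mass):
    """p'^2 for p' = p + k - k', with p = (M, 0) and massless electron 4-vectors."""
    kp = scattered_momentum(k, theta, target_mass)
    e_out = target_mass + k - kp
    px = -kp * math.sin(theta)
    pz = k - kp * math.cos(theta)
    return e_out * e_out - px * px - pz * pz


k, theta, M = 0.5, 1.0, 0.938   # sample beam momentum, angle and target mass
ratio = no_structure_cross_section(k, theta, M) / no_recoil_cross_section(k, theta)
print(ratio, scattered_momentum(k, theta, M) / k)
```

For a light target the recoil factor suppresses the cross section noticeably at large angles; as the target mass grows the ratio approaches 1.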

This ‘no-structure’ cross section also occurs in the cross section for the scattering of 
electrons by protons or muons: the appellation ‘no-structure’ will be made clearer in the 
discussion of form factors which follows. As in the case of $e^\pm$ Coulomb scattering, the cross sections for $e^-s^+$ and for $e^+s^+$ scattering are identical at this (lowest) order of perturbation theory.


8.4 Scattering from a non-point-like object: the pion form factor in $e^-\pi^+ \to e^-\pi^+$
As remarked earlier, we have been careful not to call the '$s^+$' particle a $\pi^+$, because the latter is a composite system which cannot be expected to have point-like interactions with the electromagnetic field, as has been assumed for the $s^+$; rather, in the case of the $\pi^+$ it is the quark constituents which interact locally with the electromagnetic field. The quarks also, of course, interact strongly with each other via the interactions of QCD, and since these are strong they cannot (in this case) be treated perturbatively. Indeed, a full understanding of the electromagnetically-probed 'structure' of hadrons has not yet been achieved. Instead, we must describe the $e^-$ scattering from physical $\pi^+$'s in terms of a phenomenological quantity, the pion form factor, which encapsulates in a relativistically invariant manner the 'non-point-like' aspect of the hadronic state $\pi^+$.
The physical process is

$$e^-(k,s) + \pi^+(p) \to e^-(k',s') + \pi^+(p') \tag{8.128}$$


which we represent, in general, by figure 8.7. To lowest order in $\alpha$, the amplitude is represented diagrammatically by a generalization of figure 8.5, shown in figure 8.8, in which the point-like $ss\gamma$ vertex is replaced by the $\pi\pi\gamma$ 'blob', which signifies all the unknown strong interaction corrections.

FIGURE 8.7
$e^-\pi^+$ scattering amplitude.


8.4.1 $e^-$ scattering from a charge distribution


It is helpful to begin the discussion by returning to $e^-$ Coulomb scattering again, but this time let us consider the case in which the potential $A^0(\boldsymbol{x})$ corresponds, not to a point charge, but to a spread-out charge density $\rho(\boldsymbol{x})$. Then $A^0(\boldsymbol{x})$ satisfies Poisson's equation

$$\nabla^2A^0(\boldsymbol{x}) = -Ze\rho(\boldsymbol{x}). \tag{8.129}$$

Note that if $A^0(\boldsymbol{x}) = Ze/4\pi|\boldsymbol{x}|$ as in (8.13) then $\rho(\boldsymbol{x}) = \delta(\boldsymbol{x})$ (see appendix G) and we recover the point-like source. The calculation of the Coulomb matrix element will proceed as before, except that now we require, at equation (8.43), the Fourier transform

$$\tilde{A}^0(\boldsymbol{q}) = \int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}A^0(\boldsymbol{x})\,d^3x \tag{8.130}$$

where $\boldsymbol{q} = \boldsymbol{k} - \boldsymbol{k}'$. To evaluate (8.130), note first that from the definition of $A^0(\boldsymbol{x})$, we can write

$$\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}\nabla^2A^0(\boldsymbol{x})\,d^3x = -Ze\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}\rho(\boldsymbol{x})\,d^3x \equiv -ZeF(\boldsymbol{q}) \tag{8.131}$$

where the (static) form factor $F(\boldsymbol{q})$ has been introduced, the Fourier transform of $\rho(\boldsymbol{x})$, satisfying

$$F(0) = \int\rho(\boldsymbol{x})\,d^3x = 1. \tag{8.132}$$

Condition (8.132) simply means that the total charge is $Ze$. The left-hand side of (8.131) can be transformed by two (three-dimensional) partial integrations to give

$$\int\left(\nabla^2e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}\right)A^0(\boldsymbol{x})\,d^3x = -\boldsymbol{q}^2\int e^{-i\boldsymbol{q}\cdot\boldsymbol{x}}A^0(\boldsymbol{x})\,d^3x. \tag{8.133}$$

Using this result in (8.131), we find

$$\tilde{A}^0(\boldsymbol{q}) = F(\boldsymbol{q})\,\frac{Ze}{\boldsymbol{q}^2}. \tag{8.134}$$

Thus referring to equation (8.44) for example, the net result of the non-point-like charge distribution is to multiply the 'point-like' amplitude $Ze^2/\boldsymbol{q}^2$ by the form factor $F(\boldsymbol{q})$ which in this simple static case has the interpretation of the Fourier transform of the charge distribution. So, for this (infinitely heavy $\pi^+$) case, the 'blob' in figure 8.8 would be represented by $F(\boldsymbol{q})$.

FIGURE 8.8
One-photon exchange amplitude in $e^-\pi^+$ scattering, including hadronic corrections at the $\pi\pi\gamma$ vertex.

To gain some idea of what $F(\boldsymbol{q}^2)$ might look like, consider a simple exponential shape for $\rho(\boldsymbol{x})$:

$$\rho(\boldsymbol{x}) = \frac{1}{8\pi a^3}\,e^{-|\boldsymbol{x}|/a} \tag{8.135}$$

which has been normalized according to (8.132). Then $F(\boldsymbol{q}^2)$ is (problem 8.13)

$$F(\boldsymbol{q}^2) = \frac{1}{(\boldsymbol{q}^2a^2+1)^2}. \tag{8.136}$$

We see that $F(\boldsymbol{q}^2)$ decreases smoothly away from unity at $\boldsymbol{q}^2 = 0$. The characteristic scale of the fall-off in $|\boldsymbol{q}|$ is $\sim a^{-1}$ from (8.136), which, as expected from Fourier transform theory, is the reciprocal of the spatial fall-off, which is approximately $a$ from (8.135); the root mean square radius of the distribution (8.135) is actually $\sqrt{12}\,a$ (problem 8.13). Since $\boldsymbol{q}^2 = 4\boldsymbol{k}^2\sin^2\theta/2$, a larger $\boldsymbol{q}^2$ means a larger $\theta$: hence, in scattering from an extended charge distribution, the cross section at larger angles will drop below the point-like value. This is, of course, how Rutherford deduced that the nucleus had a spatial extension.
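The dipole form (8.136) and the root mean square radius of the exponential distribution (8.135) can be checked by direct numerical integration. The sketch below is illustrative: it uses the fact that, for a spherically symmetric density, the three-dimensional Fourier transform reduces to the radial integral $\int 4\pi r^2\rho(r)\,\sin(qr)/(qr)\,dr$, and it integrates with a plain composite Simpson rule out to a cut-off radius (an assumption of the sketch, chosen so the exponential tail is negligible). All function names are introduced here for illustration.

```python
import math


def rho(r, a):
    """Exponential charge density (8.135), normalized to unit total charge."""
    return math.exp(-r / a) / (8.0 * math.pi * a ** 3)


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def form_factor_numeric(q, a, cutoff_in_a=60.0):
    """Radial form of the Fourier transform of rho, integrated up to cutoff_in_a * a."""
    def integrand(r):
        x = q * r
        sinc = 1.0 if x == 0.0 else math.sin(x) / x
        return 4.0 * math.pi * r * r * rho(r, a) * sinc
    return simpson(integrand, 0.0, cutoff_in_a * a)


def form_factor_dipole(q, a):
    """Closed form (8.136)."""
    return 1.0 / (q * q * a * a + 1.0) ** 2


def rms_radius(a, cutoff_in_a=60.0):
    """Square root of the mean of r^2 over the distribution."""
    mean_r2 = simpson(lambda r: 4.0 * math.pi * r ** 4 * rho(r, a),
                      0.0, cutoff_in_a * a)
    return math.sqrt(mean_r2)


a = 0.7  # sample size parameter (arbitrary units)
for q in (0.0, 0.5, 2.0, 3.0):
    print(q, form_factor_numeric(q, a), form_factor_dipole(q, a))
print(rms_radius(a), math.sqrt(12.0) * a)
```

The numerical and closed-form values agree to the accuracy of the integration, and the form factor falls from 1 at $q = 0$ on a scale set by $1/a$.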

We now seek a Lorentz-invariant generalization of this static form factor. In the absence of a fundamental understanding of the $\pi^+$ structure coming from QCD, we shall rely on Lorentz invariance and electromagnetic current conservation (one aspect of gauge invariance) to restrict the general form of the $\pi\pi\gamma$ vertex shown in figure 8.8. The use of invariance arguments to place restrictions on the form of amplitudes is an extremely general and important tool, in the absence of a complete theory.


8.4.2 Lorentz invariance 


First, consider Lorentz invariance. We seek to generalize the point-like $ss\gamma$ vertex (cf (8.98) and comment (1) after (8.99))

$$j^\mu_{s^+}(p,p') = \langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(0)|s^+,p\rangle = e(p+p')^\mu \tag{8.137}$$

to $j^\mu_{\pi^+}(p,p')$, which will include strong interaction effects. Whatever these effects are, they cannot destroy the 4-vector character of the current. To construct the general form of $j^\mu_{\pi^+}(p,p')$ therefore, we must first enumerate the independent momentum 4-vectors we have at our disposal to parametrize the 4-vector nature of the current. These are just

$$p, \quad p' \quad \text{and} \quad q \tag{8.138}$$

subject to the condition

$$p' = p + q. \tag{8.139}$$

There are two independent combinations; these we can choose to be the linear combinations

$$(p'+p)_\mu \tag{8.140}$$

and

$$(p'-p)_\mu = q_\mu. \tag{8.141}$$

Both of these 4-vectors can, in general, parametrize the 4-vector nature of the electromagnetic current of a real pion. Moreover, they can be multiplied by an unknown scalar function of the available Lorentz scalar products for this process. Since

$$p^2 = p'^2 = M^2 \tag{8.142}$$

and

$$q^2 = 2M^2 - 2p\cdot p' \tag{8.143}$$

there is only one independent scalar in the problem, which we may take to be $q^2$, the 4-momentum transfer to the vertex. Thus, from Lorentz invariance, we are led to write the electromagnetic vertex of a pion in the form

$$j^\mu_{\pi^+}(p,p') = \langle\pi^+,p'|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+,p\rangle = e\left[F(q^2)(p'+p)^\mu + G(q^2)q^\mu\right]. \tag{8.144}$$

The functions $F$ and $G$ are called 'form factors'.

This is as far as Lorentz invariance can take us. To identify the pion form factor, we 
must consider our second symmetry principle, gauge invariance—in the form of current 
conservation. 


8.4.3 Current conservation 


The Maxwell equations (7.65) reduce, in the Lorentz gauge

$$\partial_\mu A^\mu = 0 \tag{8.145}$$

to the simple form

$$\Box A^\mu = j^\mu \tag{8.146}$$

and the gauge condition is consistent with the familiar current conservation condition

$$\partial_\mu j^\mu = 0. \tag{8.147}$$

As we have seen in (8.100), the current conservation condition is equivalent to the condition

$$q_\mu\langle\pi^+(p')|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+(p)\rangle = 0 \tag{8.148}$$

on the pion electromagnetic vertex.
In the case of the point-like $s^+$ this is clearly satisfied since

$$q\cdot(p'+p) = 0 \tag{8.149}$$

with the aid of (8.142). In the general case we obtain the condition

$$q_\mu\left[F(q^2)(p'+p)^\mu + G(q^2)q^\mu\right] = 0. \tag{8.150}$$

The first term vanishes as before, but $q^2 \neq 0$ in general, and we therefore conclude that current conservation implies that

$$G(q^2) = 0. \tag{8.151}$$


In other words, all the virtual strong interaction effects at the $\pi^+\gamma$ vertex are described by one scalar function of the virtual photon's squared 4-momentum:

$$\underbrace{e\,(p'+p)^\mu}_{\text{point pion}} \;\to\; \underbrace{e\,F(q^2)\,(p'+p)^\mu}_{\text{real pion}} \tag{8.152}$$

$F(q^2)$ is the electromagnetic form factor of the pion, which generalizes the static form factor $F(\boldsymbol{q}^2)$ of section 8.4.1. The pion electromagnetic vertex is then

$$j^\mu_{\pi^+}(p,p') = eF(q^2)(p'+p)^\mu. \tag{8.153}$$

The electric charge is defined to be the coupling at zero momentum transfer, so the form factor is normalized by the condition (cf (8.132))

$$F(0) = 1. \tag{8.154}$$


To lowest order in $\alpha$, the invariant amplitude for $e^-\pi^+ \to e^-\pi^+$ is therefore given by replacing $j^\mu_{s^+}(p,p')$ in (8.97) or (8.109) by $j^\mu_{\pi^+}(p,p')$:

$$i\mathcal{M}_{e^-\pi^+} = -ie(p+p')^\mu F((p'-p)^2)\left(\frac{-ig_{\mu\nu}}{q^2}\right)\left[+ie\bar{u}(k',s')\gamma^\nu u(k,s)\right]. \tag{8.155}$$

It is clear that the effect of the pion structure is simply to multiply the 'no-structure' cross section (8.122) by the square of the form factor, $F(q^2 = (p'-p)^2)$.

For $e^-\pi^+ \to e^-\pi^+$ in the CM frame we may take $p = (E,\boldsymbol{p})$ and $p' = (E,\boldsymbol{p}')$ with $|\boldsymbol{p}| = |\boldsymbol{p}'|$ and $E = (m_\pi^2 + \boldsymbol{p}^2)^{1/2}$. Then

$$q^2 = (p'-p)^2 = -4\boldsymbol{p}^2\sin^2\theta/2 \tag{8.156}$$

as in section 8.1, where $\theta$ is now the CM scattering angle between $\boldsymbol{p}$ and $\boldsymbol{p}'$. Hence $F(q^2)$ can be probed for negative (space-like) values of $q^2$, in the process $e^-\pi^+ \to e^-\pi^+$. As in the static case, we expect the form factor to fall off as $-q^2$ increases since, roughly speaking, it represents the amplitude for the target to remain intact when probed by the electromagnetic current. As $-q^2$ increases, the amplitudes of inelastic processes which involve the creation of extra particles become greater, and the elastic amplitude is correspondingly reduced. We shall consider inelastic scattering in the following chapter.

Interestingly, $F(q^2)$ may also be measured at positive (time-like) $q^2$ in the related reaction $e^+e^- \to \pi^+\pi^-$ as we now discuss.


8.5 The form factor in the time-like region: $e^+e^- \to \pi^+\pi^-$ and crossing symmetry


The physical process is

$$e^+(k_1,s_1) + e^-(k,s) \to \pi^+(p') + \pi^-(p_1) \tag{8.157}$$

as shown in figure 8.9. We can use this as an instructive exercise in the Feynman interpretation of section 3.4.4. From that section, we know that the invariant amplitude for (8.157) is equal to minus the amplitude for a process in which the ingoing anti-particle $e^+$ with $(k_1,s_1)$ becomes an outgoing particle $e^-$ with $(-k_1,-s_1)$, and the outgoing anti-particle $\pi^-$ with $p_1$ becomes an ingoing particle $\pi^+$ with $-p_1$. In this way the 'physical' (positive 4-momentum) anti-particle states ($e^+$ and $\pi^-$) are replaced by appropriate 'unphysical' (negative 4-momentum) particle states ($e^-$ and $\pi^+$). These changes transform figure 8.9 to figure 8.10.

FIGURE 8.9
$e^+e^- \to \pi^+\pi^-$ scattering amplitude.

FIGURE 8.10
The amplitude of figure 8.9, with positive 4-momentum anti-particles replaced by negative 4-momentum particles.

FIGURE 8.11
The amplitude of figure 8.10 redrawn so as to obtain a reaction in which the initial state has only 'ingoing' lines and the final state has only 'outgoing' lines.

If we now look at figure 8.10 'from the top downwards' (instead of from left to right; remember that Feynman diagrams are not in coordinate space!), we see a process of $e^-\pi^+$ scattering, namely

$$e^-(k,s) + \pi^+(-p_1) \to e^-(-k_1,-s_1) + \pi^+(p'). \tag{8.158}$$

But (8.158) is something we have already calculated! (Though we shall have to substitute a negative-energy spinor $v$ for a positive energy one $u$.) In fact, let us redraw figure 8.10 as figure 8.11 to make it look more like figure 8.7. Then, to lowest order in $\alpha$, the amplitude for figure 8.11 is shown in figure 8.12 (compare figure 8.8). To obtain the corresponding mathematical expression for the amplitude $i\mathcal{M}_{e^+e^-\to\pi^+\pi^-}$, we simply need to modify (8.155): (a) by inserting a minus sign; (b) by replacing $p$ by $-p_1$ and $k'$ by $-k_1$ as in figure 8.12; and (c) by replacing $\bar{u}(k',s')$ by $\bar{v}(k_1,s_1)$. This yields the invariant amplitude for figure 8.12 as

$$i\mathcal{M}_{e^+e^-\to\pi^+\pi^-} = -ie(-p_1+p')^\mu F((p_1+p')^2)\left(\frac{-ig_{\mu\nu}}{q^2}\right)\times\left[-ie\bar{v}(k_1,s_1)\gamma^\nu u(k,s)\right] \tag{8.159}$$

where now $q = p_1 + p'$, and which is represented by the Feynman diagram of figure 8.13 for the original process of (8.157) and figure 8.9.

FIGURE 8.12
One-photon exchange amplitude for the process of figure 8.11.

FIGURE 8.13
One-photon exchange amplitude for the process of figure 8.9.

In the language introduced in section 6.3.3, figure 8.13 is an '$s$-channel process' ($s = (k+k_1)^2 = (p_1+p')^2$) for $e^+e^- \to \pi^+\pi^-$, whereas figure 8.8 is a '$t$-channel process' ($t = (k-k')^2 = (p'-p)^2$) for $e^-\pi^+ \to e^-\pi^+$. However, we have seen that the amplitude for the $e^+e^- \to \pi^+\pi^-$ process can be obtained from the $e^-\pi^+ \to e^-\pi^+$ amplitude by making the replacement $k' \to -k_1$, $p \to -p_1$ (together with the sign, and $\bar{u} \to \bar{v}$). Under these replacements of the 4-momenta, the variable $t = (k-k')^2 = (p-p')^2$ of figure 8.8 becomes the variable $s = (k+k_1)^2 = (p_1+p')^2$ of figure 8.13. In particular, as is evident in the formula (8.159), the same form factor $F$ is a function of the invariant $s = (p_1+p')^2$ in process (8.157), and of $t = (p-p')^2$ in process (8.128). The interesting thing is that whereas (as we have seen) '$t$' is negative in process (8.128), '$s$' for process (8.157) is the square of the total CM energy, which is $\geq 4M^2$ where $M$ is the pion mass ($2M$ is the threshold energy for the reaction to proceed in the CM system). Thus the form factor can be probed at negative values of its argument in the process $e^-\pi^+ \to e^-\pi^+$, and at positive values $\geq 4M^2$ in the process $e^+e^- \to \pi^+\pi^-$. In the next chapter (section 9.5) we shall see how, in the latter process, meson resonances dominate $F(s)$.
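The two kinematic regions can be illustrated with explicit 4-vectors. The sketch below is an illustration only: it works in the CM frame of each process, uses approximate particle masses in GeV, and picks arbitrary sample momenta and angles. It evaluates $t$ for $e^-\pi^+ \to e^-\pi^+$, compares it with (8.156), and then shows that the same invariant, evaluated with $k' \to -k_1$ and $p \to -p_1$, is the $s$ of $e^+e^- \to \pi^+\pi^-$.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)
M_PI = 0.13957    # charged pion mass in GeV (approximate)
M_E = 0.000511    # electron mass in GeV (approximate)


def dot(a, b):
    """Minkowski product with metric diag(1, -1, -1, -1)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def neg(a):
    return tuple(-x for x in a)


def on_shell(mass, px, py, pz):
    """4-vector (E, px, py, pz) with E fixed by the mass-shell condition."""
    return (math.sqrt(mass * mass + px * px + py * py + pz * pz), px, py, pz)


theta = 0.9

# Space-like region: e-(k) pi+(p) -> e-(k') pi+(p') in the CM frame.
pmod = 0.3
k = on_shell(M_E, 0.0, 0.0, pmod)
p = on_shell(M_PI, 0.0, 0.0, -pmod)
kp = on_shell(M_E, pmod * math.sin(theta), 0.0, pmod * math.cos(theta))
pp = on_shell(M_PI, -pmod * math.sin(theta), 0.0, -pmod * math.cos(theta))
t_lepton = dot(sub(k, kp), sub(k, kp))
t_hadron = dot(sub(pp, p), sub(pp, p))
t_formula = -4.0 * pmod ** 2 * math.sin(theta / 2.0) ** 2   # (8.156)

# Time-like region: e+(k1) e-(k_in) -> pi+(p_out) pi-(p1) in the CM frame.
e_beam = 0.4
k_mod = math.sqrt(e_beam ** 2 - M_E ** 2)
pi_mod = math.sqrt(e_beam ** 2 - M_PI ** 2)
k_in, k1 = (e_beam, 0.0, 0.0, k_mod), (e_beam, 0.0, 0.0, -k_mod)
p_out = (e_beam, pi_mod * math.sin(theta), 0.0, pi_mod * math.cos(theta))
p1 = (e_beam, -pi_mod * math.sin(theta), 0.0, -pi_mod * math.cos(theta))
s = dot(add(k_in, k1), add(k_in, k1))

# Crossing: (k - k')^2 with k' -> -k1, and (p' - p)^2 with p -> -p1.
t_crossed_lepton = dot(sub(k_in, neg(k1)), sub(k_in, neg(k1)))
t_crossed_hadron = dot(sub(p_out, neg(p1)), sub(p_out, neg(p1)))

print(t_lepton, t_formula)
print(s, t_crossed_lepton, t_crossed_hadron, 4.0 * M_PI ** 2)
```

The first line of output is negative (space-like), while the second shows $s$ equal to the crossed $t$ and above the threshold value $4M^2$.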

The procedure whereby an ingoing/outgoing anti-particle is switched to an outgoing/ingoing particle is called 'crossing' (the state is being 'crossed' from one side of the reaction to the other). By an extension of this language, $e^+e^- \to \pi^+\pi^-$ is called the crossed process relative to $e^-\pi^+ \to e^-\pi^+$ (or vice versa). The fact that the amplitude for a given process and its 'crossed' analogue are directly related via the Feynman interpretation (or by quantum field theory!) is called 'crossing symmetry'. In the example studied here, what is an $s$-channel process for one reaction becomes a $t$-channel process for the crossed reaction. Essentially, little more is involved than looking in the one case from left to right and, in the other, from top to bottom!


8.6 Electron Compton scattering
8.6.1 The lowest-order amplitudes


We proceed to explore some other elementary electromagnetic processes. So far we have not considered a reaction with external photons, so let us now discuss electron Compton scattering

$$\gamma(k,\lambda) + e^-(p,s) \to \gamma(k',\lambda') + e^-(p',s') \tag{8.160}$$


where the $\lambda$'s stand for the polarizations of the photons. Since only the $\gamma$'s and $e^-$'s are involved, the interaction Hamiltonian is simply $\hat{H}'_{\rm D}$ and it is clear that this must act at least twice in the reaction (8.160). By following the method of section 6.3.2 one can formally derive what we are here going to assume is by now obvious, which is that to order $e^2$ (i.e. $\alpha$ in the amplitude) there are two contributing Feynman graphs, as shown in figures 8.14(a) and (b). The first is an $s$-channel process, the second a $u$-channel process. We already know the factors for the vertices and for the external electron lines; we need to know the factors for the internal electron lines (propagators) and the external photon lines. The fermion propagator was given in section 7.2 and is $i/(\slashed{q} - m + i\epsilon)$ for a line carrying 4-momentum $q$. As regards the 'external-$\gamma$' factor, this will arise from contractions of the form (cf (6.90))

$$\sqrt{2E_{k'}}\,\langle 0|\hat{\alpha}(k',\lambda')\hat{A}^\mu(x_1)|0\rangle = \epsilon^{\mu*}(k',\lambda')\,e^{ik'\cdot x_1} \tag{8.161}$$

where the evaluation of the vev has used the mode expansion (7.104) and the commutation relations (7.108), as usual; note, however, that only transverse polarization states ($\lambda, \lambda' = 1$ and 2) enter in the external (physical) photon lines in figures 8.14(a) and (b).

Thus we add two more rules to the (i) to (v) of section 8.3.1:

(vi) For an incoming photon of 4-momentum $k$ and polarization $\lambda$, there is a factor $\epsilon^\mu(k,\lambda)$; for an outgoing one, $\epsilon^{\mu*}(k',\lambda')$.

(vii) For an internal spin-$\frac{1}{2}$ particle carrying 4-momentum $q$, there is a factor $i/(\slashed{q} - m + i\epsilon) = i(\slashed{q} + m)/(q^2 - m^2 + i\epsilon)$.

FIGURE 8.14
$O(e^2)$ contributions to electron Compton scattering.


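The invariant amplitude $\mathcal{M}_{\gamma\mathrm{e}^-}$ corresponding to figures 8.14(a) and (b) is therefore

$$\begin{aligned}
\mathcal{M}_{\gamma\mathrm{e}^-} = {} & -e^2\,\epsilon^*_\mu(k',\lambda')\,\epsilon_\nu(k,\lambda)\,\bar{u}(p',s')\,\gamma^\mu\,\frac{\slashed{p}+\slashed{k}+m}{(p+k)^2-m^2}\,\gamma^\nu\,u(p,s) \\
& -e^2\,\epsilon_\nu(k,\lambda)\,\epsilon^*_\mu(k',\lambda')\,\bar{u}(p',s')\,\gamma^\nu\,\frac{\slashed{p}-\slashed{k}'+m}{(p-k')^2-m^2}\,\gamma^\mu\,u(p,s).
\end{aligned} \tag{8.162}$$

To get the spinor factors in expressions such as these, the rule is to start at the ingoing
fermion line (‘$u(p,s)$’) and follow the line through until the end, inserting vertices and
propagators in the right order, until you reach the outgoing state (‘$\bar{u}$’). Note that here
$s = (p+k)^2$ and $u = (p-k')^2$.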


8.6.2 Gauge invariance 


We learned in section 7.3.1 that the gauge symmetry ($A^\mu \to A^\mu - \partial^\mu\chi$) of electromagnetism,
as applied to real free photons, implied that any photon polarization vector $\epsilon^\mu(k,\lambda)$ could
be replaced by

$$\epsilon'^\mu(k,\lambda) = \epsilon^\mu(k,\lambda) + \beta k^\mu \tag{8.163}$$

where $\beta$ is an arbitrary constant. Such a transformation amounted to a change of gauge,
always remaining within the Lorentz gauge for which $\epsilon\cdot k = \epsilon'\cdot k = 0$. Thus our amplitude
(8.162) must be unchanged if we make either or both of the replacements $\epsilon \to \epsilon + \beta k$ and
$\epsilon^* \to \epsilon^* + \beta k'$ indicated in (8.163). This means that if in (8.162) we replace either or both of
$\epsilon_\nu(k,\lambda)$ and $\epsilon^*_\mu(k',\lambda')$ by $k_\nu$ and $k'_\mu$, respectively, the result has to be zero. This can indeed
be verified (problem 8.14).

A similar result is generally true and very important. Consider a process, shown in
figure 8.15, involving a photon of momentum $k^\mu$, whose polarization state is described by
the vector $\epsilon^\mu$. The amplitude $A_\gamma$ for this process must be linear in the photon polarization
vector and thus we may write

$$A_\gamma = \epsilon^\mu T_\mu \tag{8.164}$$

where $T_\mu$ depends on the particular process under consideration. With the Lorentz choice
for $\epsilon^\mu$ we have

$$k\cdot\epsilon = 0. \tag{8.165}$$

But gauge invariance implies that if we replace $\epsilon^\mu$ in (8.164) by $k^\mu$ we must get zero:

$$k^\mu T_\mu = 0. \tag{8.166}$$

This important condition on $T_\mu$ is known as a Ward identity (Ward 1950).

FIGURE 8.15
General one-photon process.


8.6.3 The Compton cross section 


The calculation of the cross section is of considerable interest, since it is required when
considering lowest-order QCD corrections to the parton model for deep inelastic scattering
of leptons from nucleons (see the following chapter and volume 2). We must average $|\mathcal{M}_{\gamma\mathrm{e}^-}|^2$
over initial electron spins and photon polarizations and sum over final ones. Consider first
the $s$-channel process of figure 8.14(a), with amplitude $\mathcal{M}^{(s)}_{\gamma\mathrm{e}^-}$. For this contribution we
must evaluate

$$\frac{e^4}{4(s-m^2)^2}\sum_{\lambda,\lambda',s,s'}\epsilon'^*_\mu\,\epsilon_\nu\,\epsilon'_\rho\,\epsilon^*_\sigma\;\bar{u}'\gamma^\mu(\slashed{p}+\slashed{k}+m)\gamma^\nu u\;\bar{u}\gamma^\sigma(\slashed{p}+\slashed{k}+m)\gamma^\rho u' \tag{8.167}$$

where we have shortened the notation in an obvious way and introduced the invariant
Mandelstam variable (section 6.3.3) $s = (p+k)^2$. We know how to write the spin sums in
a convenient form, as a trace. We need to find a similar trick for the polarization sum.
Consider the general ‘one-photon’ process shown in figure 8.15, with amplitude
$A_\gamma = \epsilon^\mu(k,\lambda)T_\mu$, where $\epsilon^\mu(k,1) = (0,1,0,0)$ and $\epsilon^\mu(k,2) = (0,0,1,0)$, and $k^\mu = (k,0,0,k)$. Then
the required polarization sum would be

$$\sum_{\lambda=1,2}\epsilon^\mu(k,\lambda)T_\mu\,\epsilon^{\nu *}(k,\lambda)T^*_\nu = |T_1|^2 + |T_2|^2. \tag{8.168}$$

However, we also know that $k^\mu T_\mu = 0$ from the Ward identity (8.166). This tells us that

$$kT_0 - kT_3 = 0 \tag{8.169}$$

and hence $T_0 = T_3$. It follows that we may write (8.168) as

$$\sum_{\lambda=1,2}\epsilon^\mu(k,\lambda)\epsilon^{\nu *}(k,\lambda)T_\mu T^*_\nu = |T_1|^2 + |T_2|^2 + |T_3|^2 - |T_0|^2 \tag{8.170}$$

$$= -g^{\mu\nu}T_\mu T^*_\nu. \tag{8.171}$$

Thus we may replace the non-covariant expression ‘$\sum_{\lambda=1,2}\epsilon^\mu(k,\lambda)\epsilon^{\nu *}(k,\lambda)$’ by the covariant
one ‘$-g^{\mu\nu}$’. The reader may here recall equation (7.118), where the ‘pseudo-completeness’
relation involving all four $\epsilon$'s was given, a similarly covariant expression. This relation
corresponds exactly to the right-hand side of (8.170), which (in these terms) shows that the
$\lambda = 0$ state enters with negative norm.
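The replacement of the transverse polarization sum by $-g^{\mu\nu}$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: the amplitude components `T_ward` and `T_no_ward` are made-up numbers, and all vectors are stored as contravariant components with the metric applied explicitly, so that the Ward condition $k\cdot T = 0$ reads $T^0 = T^3$ for a photon moving along $z$.

```python
# Metric signature (+, -, -, -); 4-vectors are tuples of contravariant components.
METRIC = (1.0, -1.0, -1.0, -1.0)


def mdot(a, b):
    """Minkowski product a.b = g_{mu nu} a^mu b^nu (no complex conjugation)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def transverse_sum(T):
    """Left-hand side of (8.168): sum over lambda = 1, 2 of |eps(lambda).T|^2."""
    eps = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]  # photon moving along z
    return sum(abs(mdot(e, T)) ** 2 for e in eps)


def covariant_sum(T):
    """Right-hand side of (8.171): -g_{mu nu} T^mu T^nu*."""
    return -sum(g * abs(t) ** 2 for g, t in zip(METRIC, T))


k = (2.0, 0.0, 0.0, 2.0)  # real photon, k^2 = 0

# Hypothetical amplitudes: the first satisfies k.T = 0, the second does not.
T_ward = (0.7 - 0.2j, 1.1 + 0.3j, -0.4 + 0.9j, 0.7 - 0.2j)
T_no_ward = (0.1 + 0.5j, 1.1 + 0.3j, -0.4 + 0.9j, 0.7 - 0.2j)

print(abs(mdot(k, T_ward)))                                   # Ward identity holds
print(transverse_sum(T_ward), covariant_sum(T_ward))          # the two sums agree
print(transverse_sum(T_no_ward), covariant_sum(T_no_ward))    # the two sums differ
```

The second amplitude shows that the covariant replacement is only legitimate when the Ward identity holds; without it the unphysical $\lambda = 0$ and $\lambda = 3$ contributions no longer cancel.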

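Using this result, the term (8.167) becomes

$$\begin{aligned}
& \frac{e^4}{4(s-m^2)^2}\sum_{s,s'}\bar{u}'\gamma^\mu(\slashed{p}+\slashed{k}+m)\gamma^\nu u\;\bar{u}\gamma_\nu(\slashed{p}+\slashed{k}+m)\gamma_\mu u' \\
& \quad = \frac{e^4}{4(s-m^2)^2}\,\mathrm{Tr}[\gamma_\mu(\slashed{p}'+m)\gamma^\mu(\slashed{p}+\slashed{k}+m)\gamma^\nu(\slashed{p}+m)\gamma_\nu(\slashed{p}+\slashed{k}+m)]
\end{aligned} \tag{8.172}$$

where, in the second step, we have moved the $\gamma_\mu$ to the front of the trace, using (8.71).
Expression (8.172) involves the trace of eight $\gamma$ matrices, which is beyond the power of the
machinery given so far. However, it simplifies greatly if we neglect the electron mass, that
is, if we are interested in the high-energy limit, as we shall be in parton model applications.
In that case, (8.172) becomes

$$\frac{e^4}{4s^2}\,\mathrm{Tr}[\gamma_\mu\slashed{p}'\gamma^\mu(\slashed{p}+\slashed{k})\gamma^\nu\slashed{p}\gamma_\nu(\slashed{p}+\slashed{k})] \tag{8.173}$$

which we can simplify using the result (J.3) to

$$\frac{e^4}{s^2}\,\mathrm{Tr}[\slashed{p}'(\slashed{p}+\slashed{k})\slashed{p}(\slashed{p}+\slashed{k})] \tag{8.174}$$

$$= \frac{e^4}{s^2}\,\mathrm{Tr}[\slashed{p}'\slashed{k}\slashed{p}\slashed{k}] \quad \text{using } \slashed{p}\slashed{p} = p^2 = 0 \tag{8.175}$$

$$= \frac{e^4}{s^2}\,8(p'\cdot k)(p\cdot k) \quad \text{using (8.76) and } k^2 = 0 \tag{8.176}$$

$$= -2e^4u/s \tag{8.177}$$

where $u = (p-k')^2$. In the last step we used the massless relations $s = 2p\cdot k$ and, since
$p - k' = p' - k$, $u = -2p'\cdot k$. Problem 8.15 finishes the calculation, with the result that the
spin-averaged squared amplitude is

$$\frac{1}{4}\sum_{s,s',\lambda,\lambda'}|\mathcal{M}_{\gamma\mathrm{e}^-}|^2 = 2e^4\left(-\frac{u}{s} - \frac{s}{u}\right). \tag{8.178}$$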

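The cross section in the CMS is then (cf (6.129))

$$\frac{\mathrm{d}\sigma}{\mathrm{d}(\cos\theta)} = \frac{2\pi\,2e^4}{64\pi^2 s}\left(-\frac{u}{s} - \frac{s}{u}\right) = \frac{\pi\alpha^2}{s}\left(-\frac{u}{s} - \frac{s}{u}\right). \tag{8.179}$$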
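As a quick numerical illustration of (8.178) and (8.179), the sketch below evaluates the massless Compton result and confirms that the two forms of (8.179) agree once $e^2 = 4\pi\alpha$ is inserted. The function names are ours, the value of $\alpha$ is an assumed input, and we assume that $\theta$ is the CM angle between the incoming and outgoing photon, so that $u = -s(1+\cos\theta)/2$ for massless particles.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (assumed input value)


def compton_msq(s, u, e2=4.0 * math.pi * ALPHA):
    """Spin-averaged |M|^2 of (8.178) for massless electrons; e2 stands for e^2."""
    return 2.0 * e2 ** 2 * (-u / s - s / u)


def compton_dsigma_dcos(s, cos_theta):
    """d(sigma)/d(cos theta) of (8.179), in units of 1/s.

    Assumes theta is the CM angle between incoming and outgoing photon,
    so that u = -s (1 + cos theta) / 2 in the massless limit.
    """
    u = -0.5 * s * (1.0 + cos_theta)
    return math.pi * ALPHA ** 2 / s * (-u / s - s / u)


s, c = 10.0, 0.3
u = -0.5 * s * (1.0 + c)
first_form = 2.0 * math.pi * compton_msq(s, u) / (64.0 * math.pi ** 2 * s)
print(first_form, compton_dsigma_dcos(s, c))  # the two forms of (8.179) coincide
```

Since $s > 0$ and $u < 0$ in the physical region, both terms in the bracket are positive; the $-s/u$ term grows without bound as $u \to 0$, which is an artefact of neglecting the electron mass.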

For parton model calculations, what is actually required is the analogous quantity
calculated for the case in which the initial photon is virtual (see section 9.2). However, the
discussion of section 7.3.2 shows that we may still use the polarization sum (8.170). A
difference will arise in passing from (8.175) to (8.176) where we must remember that $k^2 \neq 0$.
Since $k^2$ will be space-like, we put $k^2 = -Q^2$ and find (problem 8.16) that the spin-averaged
squared amplitude for the virtual Compton process

$$\gamma^*(k^2 = -Q^2) + \mathrm{e}^- \to \gamma + \mathrm{e}^- \tag{8.180}$$

is given by

$$2e^4\left(-\frac{u}{s} - \frac{s}{u} + \frac{2Q^2t}{su}\right). \tag{8.181}$$

8.7 Electron muon elastic scattering 


Our final group of electrodynamic processes are ones in which two fermions interact elec- 
tromagnetically. In this section we discuss the scattering of two point-like fermions (i.e. 
leptons); in the following one we look at the changes (analogous to those for the $\pi^+$ as
compared to the $\mathrm{s}^+$) necessitated when one fermion is a hadron, for example the proton.

FIGURE 8.16
$\mathrm{e}^-\mu^-$ scattering amplitude.


We shall consider $\mathrm{e}^-\mu^-$ elastic scattering: our notation is indicated in figure 8.16. In
the lowest order of perturbation theory (the one-photon exchange approximation) we can
draw the relevant Feynman graph for this process. This is shown in figure 8.17. All the
elements for the graph have been met before and so we can immediately write down the
invariant amplitude, which now depends on four spin labels:

$$\mathcal{M}_{\mathrm{e}^-\mu^-}(r,s;r',s') = e\,\bar{u}(k',s')\gamma_\mu u(k,s)\,(g^{\mu\nu}/q^2)\,e\,\bar{u}(p',r')\gamma_\nu u(p,r). \tag{8.182}$$

Although experiments with polarized leptons are not uncommon, we shall only be
concerned with the unpolarized cross section

$$\mathrm{d}\bar{\sigma} \sim \frac{1}{4}\sum_{r,r';s,s'}|\mathcal{M}_{\mathrm{e}^-\mu^-}(r,s;r',s')|^2. \tag{8.183}$$

We perform the same manipulations as in our $\mathrm{e}^-\mathrm{s}^+$ example and the cross section reduces
to a factorized form involving two traces:

$$\frac{1}{4}\sum_{r,r';s,s'}|\mathcal{M}_{\mathrm{e}^-\mu^-}(r,s;r',s')|^2 = \left(\frac{e^2}{q^2}\right)^2\left\{\tfrac{1}{2}\mathrm{Tr}[(\slashed{k}'+m)\gamma_\mu(\slashed{k}+m)\gamma_\nu]\right\}\left\{\tfrac{1}{2}\mathrm{Tr}[(\slashed{p}'+M)\gamma^\mu(\slashed{p}+M)\gamma^\nu]\right\} \tag{8.184}$$

$$= (e^2/q^2)^2 L_{\mu\nu}M^{\mu\nu} \tag{8.185}$$

where $L_{\mu\nu}$ is the ‘electron tensor’ calculated before (see (8.119)):

$$L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}] \tag{8.186}$$

but now $M^{\mu\nu}$ is the appropriate tensor for the muon coupling, with the same structure as
$L_{\mu\nu}$:

$$M^{\mu\nu} = 2[p'^\mu p^\nu + p'^\nu p^\mu + (q^2/2)g^{\mu\nu}]. \tag{8.187}$$

FIGURE 8.17
One-photon exchange amplitude in $\mathrm{e}^-\mu^-$ scattering.


To evaluate the cross section we must perform the ‘contraction’ $L_{\mu\nu}M^{\mu\nu}$. A useful trick
to simplify this calculation is to use current conservation for the electron tensor $L_{\mu\nu}$. For
the electron transition current, the electromagnetic current conservation condition is (cf
equation (8.100))

$$q^\mu[\bar{u}(k',s')\gamma_\mu u(k,s)] = 0 \tag{8.188}$$

i.e. independent of the particular spin projections $s$ and $s'$. Since $L_{\mu\nu}$ is the product of two
such currents, summed and averaged over polarizations, current conservation implies the
conditions

$$q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0 \tag{8.189}$$

which can be explicitly checked using our result for $L_{\mu\nu}$. The usefulness of this result is
that in the contraction $L_{\mu\nu}M^{\mu\nu}$ we can replace $p'$ in $M^{\mu\nu}$ by $(p+q)$ and then drop all the
terms in which a free index is carried by $q$, i.e.

$$L_{\mu\nu}M^{\mu\nu} = L_{\mu\nu}M^{\mu\nu}_{\mathrm{eff}} \tag{8.190}$$

where

$$M^{\mu\nu}_{\mathrm{eff}} = 2[2p^\mu p^\nu + (q^2/2)g^{\mu\nu}]. \tag{8.191}$$
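The current-conservation trick can be verified with explicit 4-vectors. The sketch below is illustrative only: it assumes a massless electron of energy $k$ scattering through an angle $\theta$ from a muon at rest, with arbitrarily chosen numerical values, and the helper names are ours. It checks (8.189), the equality (8.190), and a closed form, $16M^2kk'\cos^2(\theta/2)\,[1 - (q^2/2M^2)\tan^2(\theta/2)]$, that follows from contracting (8.186) with (8.191) in this frame.

```python
import math

G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)


def mdot(a, b):
    """Minkowski product of two contravariant 4-vectors."""
    return sum(g * x * y for g, x, y in zip(G, a, b))


def lower(a):
    """Lower the index of a contravariant 4-vector."""
    return [g * x for g, x in zip(G, a)]


def sym_tensor(a, b, q2):
    """2[a_m b_n + a_n b_m + (q2/2) g_mn], indices placed as those of a and b."""
    return [[2.0 * (a[m] * b[n] + a[n] * b[m] + (0.5 * q2 * G[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def contract(t_low, t_up):
    """Full contraction of a lower-index tensor with an upper-index tensor."""
    return sum(t_low[m][n] * t_up[m][n] for m in range(4) for n in range(4))


# Illustrative kinematics (assumed values): muon at rest, massless electron.
M_mu, k, theta = 0.10566, 1.0, 0.7
s2 = math.sin(theta / 2.0) ** 2
kp = k / (1.0 + (2.0 * k / M_mu) * s2)      # elastic condition p'^2 = M^2
k4 = [k, 0.0, 0.0, k]
kp4 = [kp, kp * math.sin(theta), 0.0, kp * math.cos(theta)]
p4 = [M_mu, 0.0, 0.0, 0.0]
q4 = [a - b for a, b in zip(k4, kp4)]
pp4 = [a + b for a, b in zip(p4, q4)]
q2 = mdot(q4, q4)

L = sym_tensor(lower(kp4), lower(k4), q2)                   # L_{mu nu} of (8.186)
M_full = sym_tensor(pp4, p4, q2)                            # M^{mu nu} of (8.187)
M_eff = [[4.0 * p4[m] * p4[n] + (q2 * G[m] if m == n else 0.0)
          for n in range(4)] for m in range(4)]             # (8.191)

qL = [sum(q4[m] * L[m][n] for m in range(4)) for n in range(4)]  # q^mu L_{mu nu}
closed_form = (16.0 * M_mu ** 2 * k * kp * (1.0 - s2)
               * (1.0 - q2 * (s2 / (1.0 - s2)) / (2.0 * M_mu ** 2)))

print(qL)                                   # all four components vanish
print(contract(L, M_full), contract(L, M_eff), closed_form)
```

The bracket in the closed form is the same $1 - (q^2/2M^2)\tan^2(\theta/2)$ factor that multiplies the no-structure cross section in the laboratory-frame result.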


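The calculation of the cross section is now straightforward. In the ‘laboratory’ system,
defined (unrealistically) by the target muon at rest

$$p^\mu = (M, 0, 0, 0) \tag{8.192}$$

with $M$ now the muon mass, the result is (problem 8.17(a))

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{\mathrm{ns}}\left(1 - \frac{q^2\tan^2(\theta/2)}{2M^2}\right). \tag{8.193}$$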


Note the following points: 


Comment (a) 


The ‘no-structure’ cross section (8.122) for $\mathrm{e}^-\mathrm{s}^+$ scattering now appears modified by an
additional term proportional to $\tan^2(\theta/2)$. This is due to the spin-$\frac{1}{2}$ nature of the muon,
which gives rise to scattering from both the charge and the magnetic moment of the muon.


Comment (b) 


In the kinematics the electron mass has been neglected, which is usually a good
approximation at high energies. We should add a word of explanation for the ‘laboratory’ cross
sections we have calculated, with the target muon unrealistically at rest. The form of the
cross section, $(\mathrm{d}\sigma/\mathrm{d}\Omega)_{\mathrm{ns}}$, and of the cross section for the scattering of two Dirac point
particles, will be of great value in our discussion of the quark parton model in the next
chapter.


Comment (c) 


FIGURE 8.18
One-photon exchange amplitude in $\mathrm{e}^-\mathrm{p}$ scattering, including hadronic corrections at the
$\mathrm{pp}\gamma$ vertex.


The crossed version of this process, namely $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$, is a very important monitoring
reaction for electron-positron colliding beam machines. It is also basic to a discussion of the
predictions of the quark parton model for $\mathrm{e}^+\mathrm{e}^- \to$ hadrons, which will be discussed in section
9.5. An instructive calculation similar to this one leads to the result (see problem 8.18)

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \frac{\alpha^2}{4q^2}(1 + \cos^2\theta) \tag{8.194}$$

where all variables are defined in the $\mathrm{e}^+\mathrm{e}^-$ CM frame, $q^2$ is now the square of the CM
energy, and the electron and muon masses have been neglected. The total cross section, in
the one-photon exchange approximation, is then

$$\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2), \tag{8.195}$$


where we have made use of equation (B.18) of appendix B. 
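The numerical coefficient in (8.195) is easy to reproduce. The sketch below is a hedged illustration rather than the calculation of appendix B: the values of $\alpha$ and of the conversion constant $(\hbar c)^2 \approx 0.3894$ mb GeV$^2$ are inputs we assume, and the function names are ours. It also integrates (8.194) numerically over the full solid angle as a cross-check.

```python
import math

ALPHA = 1.0 / 137.036        # fine-structure constant (assumed input value)
HBARC2_MB_GEV2 = 0.389379    # (hbar c)^2 in mb GeV^2 (assumed conversion constant)


def dsigma_domega(q2, cos_theta):
    """(8.194) in natural units (GeV^-2), lepton masses neglected; q2 in GeV^2."""
    return ALPHA ** 2 / (4.0 * q2) * (1.0 + cos_theta ** 2)


def sigma_total_nb(q2):
    """(8.195) converted from GeV^-2 to nanobarns; q2 in GeV^2."""
    return 4.0 * math.pi * ALPHA ** 2 / (3.0 * q2) * HBARC2_MB_GEV2 * 1.0e6


def sigma_by_integration_nb(q2, n=2000):
    """Midpoint-rule integral of (8.194) over the full solid angle, in nb."""
    h = 2.0 / n
    total = sum(dsigma_domega(q2, -1.0 + (i + 0.5) * h) for i in range(n)) * h
    return 2.0 * math.pi * total * HBARC2_MB_GEV2 * 1.0e6


print(sigma_total_nb(1.0))           # close to the 86.8 nb quoted in (8.195)
print(sigma_by_integration_nb(1.0))  # agrees with the analytic integral
```

With these inputs the coefficient comes out within a fraction of a per cent of the quoted value; the small difference depends on the precise constants used.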

The energy dependence of this cross section ($\propto 1/q^2$) is important, and can be understood
by a simple dimensional argument. A cross section has dimensions of a squared
length, or in natural units (appendix B) inverse squared mass or energy. Here both colliding 
particles are taken to be pointlike, with no form factors involving a length parameter, and 
the mediating quantum is massless. At energies much larger than the lepton masses, the 
only available dimensional quantity is the CM energy. It follows that the cross section must 
be inversely proportional to the square of the CM energy, in this ‘pointlike, high energy’ 
limit. By the same token, deviations from this behaviour would be evidence for non-pointlike 
leptonic structure. 


8.8 Electron—proton elastic scattering and nucleon form factors 


In the one-photon exchange approximation, the Feynman diagram for elastic electron-proton
scattering may be drawn as in figure 8.18, where the ‘blob’ at the $\mathrm{pp}\gamma$ vertex signifies
the expected modification of the point coupling due to strong interactions. The structure
of the proton vertex can be analysed using symmetry principles in the same way as for the
pion vertex. The presence of Dirac spinors and $\gamma$-matrices makes this a somewhat involved
procedure and problem 8.19 is an example of the type of complication that arises. Full
details of such an analysis can be found in Bernstein (1968), for example. Here, however, we
shall proceed in a different way, in order to generalize more easily to inelastic scattering in
the following chapter. We focus directly on the ‘proton tensor’ $B^{\mu\nu}$, which is the product of
two proton current matrix elements, summed and averaged over polarizations, as is required
in the calculation of the unpolarized cross section (cf (8.57)):

$$B^{\mu\nu} = \frac{1}{2}\sum_{s,s'}\langle \mathrm{p};p',s'|\hat{j}^\mu_{\mathrm{em,p}}(0)|\mathrm{p};p,s\rangle\,\big(\langle \mathrm{p};p',s'|\hat{j}^\nu_{\mathrm{em,p}}(0)|\mathrm{p};p,s\rangle\big)^*. \tag{8.196}$$


We remarked in comment (a) after equation (8.193) that for $\mathrm{e}^-$ scattering from a point-like
charged fermion an additional term in the cross section was present, corresponding to
scattering from the target’s magnetic moment. Since a real proton is not a point particle, 
the virtual strong interaction effects will modify both the charge and the magnetic moment 
distribution. Hence we may expect that two form factors will be needed to describe the 
deviation from point-like behaviour. This is in fact the case, as we now show using symmetry 
arguments similar to those of section 8.4. 


8.8.1 Lorentz invariance 


$B^{\mu\nu}$ must retain its tensor character: this must be made up using the available 4-vectors
and tensors at our disposal. For the spin-averaged case we have only

$$p,\quad q,\quad \text{and}\quad g_{\mu\nu} \tag{8.197}$$

since $p' = p + q$. The antisymmetric tensor $\epsilon_{\mu\nu\alpha\beta}$ (see appendix J) must actually be ruled
out using parity invariance: the tensor $B^{\mu\nu}$ is not a pseudo tensor since $\hat{j}^\mu_{\mathrm{em,p}}$ is a vector.
It is helpful to remember that $\epsilon_{\mu\nu\alpha\beta}$ is the generalization of $\epsilon_{ijk}$ in three dimensions, and
that the vector product of two 3-vectors (a pseudo vector) may be written

$$(\mathbf{a}\times\mathbf{b})_i = \epsilon_{ijk}a_jb_k. \tag{8.198}$$


8.8.2 Current conservation 
For a real proton, current conservation gives the condition (cf (8.148))

$$q_\mu\langle \mathrm{p};p',s'|\hat{j}^\mu_{\mathrm{em,p}}(0)|\mathrm{p};p,s\rangle = 0 \tag{8.199}$$

which translates to the conditions (cf (8.189))

$$q_\mu B^{\mu\nu} = q_\nu B^{\mu\nu} = 0 \tag{8.200}$$

on the tensor $B^{\mu\nu}$.
There are only two possible tensors we can make that satisfy both these requirements.
One involves $p$ and is constructed to be orthogonal to $q$. We introduce a vector

$$\tilde{p}_\mu = p_\mu + \alpha q_\mu \tag{8.201}$$

and require

$$q\cdot\tilde{p} = 0. \tag{8.202}$$

Hence we find

$$\tilde{p}_\mu = p_\mu - (p\cdot q/q^2)q_\mu \tag{8.203}$$

and thus the tensor

$$\tilde{p}^\mu\tilde{p}^\nu = [p^\mu - (p\cdot q/q^2)q^\mu][p^\nu - (p\cdot q/q^2)q^\nu] \tag{8.204}$$

satisfies all our requirements. The second tensor must involve $g^{\mu\nu}$ and may be chosen to be

$$-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2} \tag{8.205}$$

which again satisfies our conditions. Thus from invariance arguments alone, the tensor $B^{\mu\nu}$
for the proton vertex may be parametrized by these two tensors, each multiplied by an
unknown function of $q^2$. If we define

$$\begin{aligned}
B^{\mu\nu} = {} & 4A(q^2)[p^\mu - (p\cdot q/q^2)q^\mu][p^\nu - (p\cdot q/q^2)q^\nu] \\
& + 2M^2B(q^2)(-g^{\mu\nu} + q^\mu q^\nu/q^2)
\end{aligned} \tag{8.206}$$

the cross section in the laboratory frame is (problem 8.19)

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{\mathrm{ns}}[A + B\tan^2(\theta/2)]. \tag{8.207}$$

Formula (8.207) implies that a plot of $(\mathrm{d}\sigma/\mathrm{d}\Omega)/(\mathrm{d}\sigma/\mathrm{d}\Omega)_{\mathrm{ns}}$ versus $\tan^2(\theta/2)$, at fixed $q^2$, will
be a straight line with slope $B$ and intercept $A$.

The functions $A$ and $B$ may be related to the ‘charge’ and ‘magnetic’ form factors of
the proton. The Dirac ‘charge’ and Pauli ‘anomalous magnetic moment’ form factors, $F_1$
and $F_2$, respectively, are defined by

$$\langle \mathrm{p};p',s'|\hat{j}^\mu_{\mathrm{em,p}}(0)|\mathrm{p};p,s\rangle = (+e)\,\bar{u}(p',s')\left[\gamma^\mu F_1(q^2) + \frac{\mathrm{i}\kappa F_2(q^2)}{2M}\sigma^{\mu\nu}q_\nu\right]u(p,s) \tag{8.208}$$

with the normalization

$$F_1(0) = 1 \tag{8.209}$$

$$F_2(0) = 1 \tag{8.210}$$

and the magnetic moment of the proton is not one (nuclear) magneton, as for an electron
or muon (neglecting higher-order corrections), but rather $\mu_\mathrm{p} = 1 + \kappa$ with $\kappa = 1.79$.
Problem 8.20 shows that the $\bar{u}\gamma^\mu u$ piece in (8.208) can be rewritten in terms of $\bar{u}(p+p')^\mu u/2M$
and $\bar{u}\,\mathrm{i}\sigma^{\mu\nu}q_\nu u/2M$. The first of these is analogous to the interaction of a charged spin-0
particle. As regards the second, we note that $\sigma^{\mu\nu}$ is just

$$\sigma^{\mu\nu} = \tfrac{1}{2}\mathrm{i}[\gamma^\mu, \gamma^\nu] \tag{8.211}$$

which reduces to the Pauli spin matrices for the space-like components

$$\sigma^{ij} = \begin{pmatrix}\sigma^k & 0 \\ 0 & \sigma^k\end{pmatrix} \tag{8.212}$$

with our representation of $\gamma$-matrices ($\sigma^{ij}$ is a $4\times 4$ matrix, $\sigma^k$ is $2\times 2$, and $i$, $j$ and $k$ are
in cyclic order). The second term in this ‘Gordon decomposition’ of $\bar{u}\gamma^\mu u$ thus corresponds
to an interaction via the spin magnetic moment, with, in fact, $g = 2$. Thus the addition of
the $\kappa$ term in (8.208) corresponds to an ‘anomalous’ magnetic moment piece. In terms of
$F_1$ and $F_2$ one can show that

$$A = F_1^2 + \tau\kappa^2F_2^2 \tag{8.213}$$

$$B = 2\tau(F_1 + \kappa F_2)^2 \tag{8.214}$$

where

$$\tau = -q^2/4M^2. \tag{8.215}$$


The point-like cross section (8.193) is recovered from (8.207) by setting $F_1 = 1$ and $\kappa = 0$
in (8.213) and (8.214).

The functions $F_1$ and $F_2$ are, in turn, usually expressed in terms of the electric and
magnetic form factors $G_\mathrm{E}$ and $G_\mathrm{M}$, defined by $G_\mathrm{E} = F_1 - \tau\kappa F_2$, $G_\mathrm{M} = F_1 + \kappa F_2$. We then
find $A = (G_\mathrm{E}^2 + \tau G_\mathrm{M}^2)/(1+\tau)$ and $B = 2\tau G_\mathrm{M}^2$. The cross section formula (8.207), written
in terms of $G_\mathrm{E}$ and $G_\mathrm{M}$, is known as the ‘Rosenbluth’ cross section.
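The algebra connecting the two parametrizations can be checked in a few lines. In the sketch below the function names and the sample form-factor values are our own illustrative choices, not data; the code confirms that (8.213) and (8.214) agree with the $G_\mathrm{E}$, $G_\mathrm{M}$ expressions, and that the point-like limit gives $A = 1$, $B = 2\tau$.

```python
import math


def ab_from_dirac_pauli(F1, F2, kappa, tau):
    """A and B of (8.213) and (8.214)."""
    A = F1 ** 2 + tau * kappa ** 2 * F2 ** 2
    B = 2.0 * tau * (F1 + kappa * F2) ** 2
    return A, B


def sachs_from_dirac_pauli(F1, F2, kappa, tau):
    """Electric and magnetic form factors G_E and G_M."""
    return F1 - tau * kappa * F2, F1 + kappa * F2


def ab_from_sachs(GE, GM, tau):
    """A and B written in terms of G_E and G_M."""
    return (GE ** 2 + tau * GM ** 2) / (1.0 + tau), 2.0 * tau * GM ** 2


def ratio_to_no_structure(A, B, theta):
    """Bracket of (8.207): (dsigma/dOmega) / (dsigma/dOmega)_ns."""
    return A + B * math.tan(theta / 2.0) ** 2


# Made-up sample values for illustration only.
F1, F2, kappa, tau = 0.6, 0.5, 1.79, 0.3
print(ab_from_dirac_pauli(F1, F2, kappa, tau))
print(ab_from_sachs(*sachs_from_dirac_pauli(F1, F2, kappa, tau), tau))
print(ab_from_dirac_pauli(1.0, 0.0, 0.0, tau))   # point-like limit: A = 1, B = 2 tau
```

Plotting `ratio_to_no_structure` against $\tan^2(\theta/2)$ at fixed $\tau$ gives the straight line of slope $B$ and intercept $A$ described after (8.207).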

Experimental data indicate that the $q^2$-dependences of $G_\mathrm{E}$ and $G_\mathrm{M}$ for the proton, and
of $G_\mathrm{M}$ for the neutron, are all quite well represented by the function $F(q^2)$ of (8.136) with
$q^2$ replaced by $-q^2$ and with $a \approx 0.84\ \mathrm{GeV}^{-1}$, at least for values of $-q^2$ up to a few $\mathrm{GeV}^2$
(see, for example, Perkins 1987, section 6.5).

Before we leave elastic scattering it is helpful to look in some more detail at the kinematics.
It will be sufficient to consider the ‘point-like’ case, which we shall call $\mathrm{e}^-\mu^+$, for
definiteness. Energy and momentum conservation at the $\mu^+$ vertex gives the condition

$$p + q = p' \tag{8.216}$$

with the mass-shell conditions ($M$ is the $\mu^+$ mass)

$$p^2 = p'^2 = M^2. \tag{8.217}$$

Hence for elastic scattering we have the relation

$$2p\cdot q = -q^2. \tag{8.218}$$

It is conventional to relate these invariants to the corresponding laboratory frame
($p^\mu = (M, \mathbf{0})$) expressions. Neglecting the electron mass so that

$$k \equiv |\mathbf{k}| = \omega \tag{8.219}$$

$$k' \equiv |\mathbf{k}'| = \omega' \tag{8.220}$$

(as after equation (8.126), note again that in the present context ‘$k$’ and ‘$k'$’ are not 4-vectors but the
moduli of 3-vectors) we have

$$q^2 = -2kk'(1 - \cos\theta) = -4kk'\sin^2(\theta/2) \tag{8.221}$$

and

$$p\cdot q = M(k - k') = M\nu \tag{8.222}$$

where $\nu$ is the energy transfer $q^0$ in this frame. To avoid unnecessary minus signs, it is
convenient to define

$$Q^2 = -q^2 = 4kk'\sin^2(\theta/2) \tag{8.223}$$

and the elastic scattering relation between $p\cdot q$ and $q^2$ reads

$$\nu = Q^2/2M \tag{8.224}$$

or

$$\frac{k'}{k} = \frac{1}{1 + (2k/M)\sin^2(\theta/2)}. \tag{8.225}$$

Remembering, therefore, that for elastic scattering $k'$ and $\theta$ are not independent variables,
we can perform a change of variables (see appendix K) in the laboratory frame

$$\mathrm{d}\Omega = 2\pi\,\mathrm{d}(\cos\theta) = (\pi/k'^2)\,\mathrm{d}Q^2 \tag{8.226}$$


FIGURE 8.19
Physical regions for $\mathrm{e}^-\mathrm{p}$ scattering in the $Q^2$, $\nu$ variables: A, kinematically forbidden
region; B, line of elastic scattering ($Q^2 = 2M\nu$); C, lines of resonance electroproduction; D,
photoproduction; E, deep inelastic region ($Q^2$ and $\nu$ large).


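and write the differential cross section for $\mathrm{e}^-\mu^+$ scattering as

$$\frac{\mathrm{d}\sigma}{\mathrm{d}Q^2} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\,[\cos^2(\theta/2) + 2\tau\sin^2(\theta/2)]. \tag{8.227}$$

For elastic scattering $\nu$ is not independent of $Q^2$ but we may formally write this as a
double-differential cross section by inserting the $\delta$-function to ensure this condition is satisfied:

$$\frac{\mathrm{d}^2\sigma}{\mathrm{d}Q^2\,\mathrm{d}\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left[\cos^2(\theta/2) + \left(\frac{Q^2}{2M^2}\right)\sin^2(\theta/2)\right]\delta\!\left(\nu - \frac{Q^2}{2M}\right). \tag{8.228}$$

This is the cross section for the scattering of an electron from a point-like fermion target of
charge $e$ and mass $M$.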
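The elastic kinematic relations (8.223) to (8.226) lend themselves to a short numerical check. The sketch below uses our own function names and illustrative input values; it verifies $\nu = Q^2/2M$ and tests the Jacobian behind (8.226) by a finite difference, using the fact that $\mathrm{d}\Omega = (\pi/k'^2)\,\mathrm{d}Q^2$ is equivalent to $\mathrm{d}Q^2/\mathrm{d}(\cos\theta) = -2k'^2$.

```python
import math


def elastic_kinematics(k, cos_theta, M):
    """Return (k', Q^2, nu) for elastic scattering of a massless electron
    of energy k from a target of mass M at rest."""
    s2 = 0.5 * (1.0 - cos_theta)                 # sin^2(theta/2)
    k_prime = k / (1.0 + (2.0 * k / M) * s2)     # (8.225)
    Q2 = 4.0 * k * k_prime * s2                  # (8.223)
    nu = k - k_prime                             # energy transfer, (8.222)
    return k_prime, Q2, nu


def dQ2_dcos_numeric(k, cos_theta, M, h=1.0e-6):
    """Central-difference estimate of dQ^2/d(cos theta) along the elastic line."""
    q_plus = elastic_kinematics(k, cos_theta + h, M)[1]
    q_minus = elastic_kinematics(k, cos_theta - h, M)[1]
    return (q_plus - q_minus) / (2.0 * h)


# Illustrative values: 1 GeV electron on a target with the muon mass.
k, cos_theta, M = 1.0, 0.5, 0.10566
k_prime, Q2, nu = elastic_kinematics(k, cos_theta, M)
print(nu, Q2 / (2.0 * M))                                    # (8.224)
print(dQ2_dcos_numeric(k, cos_theta, M), -2.0 * k_prime ** 2)  # Jacobian of (8.226)
```

The sign of the derivative simply reflects that $Q^2$ decreases as $\cos\theta$ increases; only its magnitude enters (8.226).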

It is illuminating to plot out the physically allowed regions of $Q^2$ and $\nu$ (figure 8.19).
Elastic $\mathrm{e}^-\mathrm{p}$ scattering corresponds to the line $Q^2 = 2M\nu$. Resonance production
$\mathrm{e}^-\mathrm{p} \to \mathrm{e}^-\mathrm{N}^*$ with $p'^2 = M'^2$ corresponds to lines parallel to the elastic line, shifted to the right by
$M'^2 - M^2$ since

$$2M\nu = Q^2 + M'^2 - M^2. \tag{8.229}$$

Experiments with real photons, $Q^2 = 0$, correspond to exploring along the $\nu$-axis. In the
next chapter we switch our attention to so-called deep inelastic electron scattering, that is,
the region of large $Q^2$ and large $\nu$.


Problems 


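8.1 Consider a matrix element of the form

$$\mathcal{M} = \int \mathrm{d}^3x \int \mathrm{d}t\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}\,\partial_\mu A^\mu\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x}.$$

Assuming the integration is over all space-time and that

$$A^0 \to 0 \quad \text{as } t \to \pm\infty$$

and

$$|\mathbf{A}| \to 0 \quad \text{as } |\mathbf{x}| \to \infty$$

use integration by parts to show

(a) $\displaystyle\int \mathrm{d}t\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}\,\partial_0 A^0\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x} = (-\mathrm{i}p_{\mathrm{f}0})\int \mathrm{d}t\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}A^0\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x}$


(b) $\displaystyle\int \mathrm{d}^3x\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}\,\boldsymbol{\nabla}\cdot\mathbf{A}\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x} = +\mathrm{i}\,\mathbf{p}_\mathrm{f}\cdot\left(\int \mathrm{d}^3x\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}\,\mathbf{A}\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x}\right).$

Hence show that

$$\int \mathrm{d}^3x \int \mathrm{d}t\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}(\partial_\mu A^\mu + A^\mu\partial_\mu)\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x} = -\mathrm{i}(p_\mathrm{f} + p_\mathrm{i})_\mu \int \mathrm{d}^3x \int \mathrm{d}t\; \mathrm{e}^{+\mathrm{i}p_\mathrm{f}\cdot x}A^\mu\,\mathrm{e}^{-\mathrm{i}p_\mathrm{i}\cdot x}.$$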

8.2 Verify equation (8.27). 
8.3 Evaluate (8.31) and interpret the result physically (i.e. compare it with (8.27)). 
8.4 


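(a) Using the $u$-spinors normalized as in (3.73), the $\phi^{1,2}$ of (8.47), and the result for
$\boldsymbol{\sigma}\cdot\mathbf{A}\,\boldsymbol{\sigma}\cdot\mathbf{B}$ from problem 3.4(b), show that

$$u^\dagger(k', s'=1)\,u(k, s=1) = (E+m)\left\{1 + \frac{\mathbf{k}'\cdot\mathbf{k}}{(E+m)^2} + \frac{\mathrm{i}\,\phi^{1\dagger}\boldsymbol{\sigma}\cdot(\mathbf{k}'\times\mathbf{k})\,\phi^1}{(E+m)^2}\right\}.$$


(b) For any vector $\mathbf{A} = (A^1, A^2, A^3)$, show that $\phi^{1\dagger}\boldsymbol{\sigma}\cdot\mathbf{A}\,\phi^1 = A^3$. Find similar
expressions for $\phi^{1\dagger}\boldsymbol{\sigma}\cdot\mathbf{A}\,\phi^2$, $\phi^{2\dagger}\boldsymbol{\sigma}\cdot\mathbf{A}\,\phi^1$, $\phi^{2\dagger}\boldsymbol{\sigma}\cdot\mathbf{A}\,\phi^2$.


(c) Show that the $S$ of (8.46) is equal to

$$S = (E+m)^2\left\{\left[1 + \frac{\mathbf{k}'\cdot\mathbf{k}}{(E+m)^2}\right]^2 + \frac{(\mathbf{k}'\times\mathbf{k})^2}{(E+m)^4}\right\}.$$


(d) Using $\cos\theta = \mathbf{k}\cdot\mathbf{k}'/(|\mathbf{k}||\mathbf{k}'|)$, $|\mathbf{k}| = |\mathbf{k}'|$ and $v = |\mathbf{k}|/E$, show that
$S = (2E)^2(1 - v^2\sin^2(\theta/2))$.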


8.5 Verify equation (8.55). 

8.6 Check that $\gamma^0\gamma^{\mu\dagger}\gamma^0 = \gamma^\mu$.

8.7 Verify equation (8.79) for the lepton tensor $L^{\mu\nu}$.

8.8 Evaluate $L^{00}$ as in equation (8.80).

8.9 Verify equation (8.87).

8.10 Verify equation (8.96) for the $\mathrm{e}^-\mathrm{s}^+ \to \mathrm{e}^-\mathrm{s}^+$ amplitude to $O(e^2)$.

8.11 Check that both the scalar and the spinor current matrix elements, (8.27) and (8.55),
satisfy $\partial_\mu j^\mu(x) = 0$.

8.12 Verify equation (8.120).

8.13 Verify equation (8.136) for the Fourier transform of $\rho(\mathbf{x})$ given by (8.135). Show that
the mean square radius of the distribution (8.135) is $12a^2$.

8.14 Check the gauge invariance of $\mathcal{M}_{\gamma\mathrm{e}^-}$ given by (8.162), by showing that if $\epsilon_\nu$ is replaced
by $k_\nu$, or $\epsilon^*_\mu$ by $k'_\mu$, the result is zero.


8.15 


(a) The spin-averaged squared amplitude for lowest-order electron Compton scattering
contains the interference term

$$\sum_{\lambda,\lambda',s,s'}\mathcal{M}^{(s)}_{\gamma\mathrm{e}^-}\mathcal{M}^{(u)*}_{\gamma\mathrm{e}^-}$$

where $(s)$ and $(u)$ refer to the $s$- and $u$-channel processes of figures 8.14(a) and
(b) respectively. Obtain an expression analogous to (8.172) for this term, and
prove that it is, in fact, zero. [Hint: work in the massless limit, and use relations
(J.4) and (J.5).]

(b) Explain why the term

$$\sum_{\lambda,\lambda',s,s'}\mathcal{M}^{(u)}_{\gamma\mathrm{e}^-}\mathcal{M}^{(u)*}_{\gamma\mathrm{e}^-}$$

is given by (8.177) with $s$ and $u$ interchanged.


8.16 Recalculate the interference term of problem 8.15(a) for the case $k^2 = -Q^2$ (but with
$k'^2 = p^2 = p'^2 = 0$), and hence verify (8.181).


8.17 


(a) Derive an expression for the spin-averaged differential cross section for lowest-order
$\mathrm{e}^-\mu^-$ scattering in the laboratory frame, defined by $p^\mu = (M, \mathbf{0})$ where $M$
is now the muon mass, and show that it may be written in the form

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{\mathrm{ns}}[1 - (q^2/2M^2)\tan^2(\theta/2)]$$

where the ‘no-structure’ cross section is that of $\mathrm{e}^-\mathrm{s}^+$ scattering (appendix K) and
the electron mass has been neglected.


(b) Neglecting all masses, evaluate the spin-averaged expression (8.184) in terms of
$s$, $t$ and $u$ and use the result

$$\frac{\mathrm{d}\sigma}{\mathrm{d}t} = \frac{1}{16\pi s^2}\,\frac{1}{4}\sum_{r,r';s,s'}|\mathcal{M}_{\mathrm{e}^-\mu^-}(r,s;r',s')|^2$$

to show that the $\mathrm{e}^-\mu^-$ cross section may be written in the form

$$\frac{\mathrm{d}\sigma}{\mathrm{d}t} = \frac{4\pi\alpha^2}{t^2}\,\frac{1}{2}\left(1 + \frac{u^2}{s^2}\right).$$

Show also that by introducing the variable $y$, defined in terms of laboratory
variables by $y = (k - k')/k$, this reduces to the result

$$\frac{\mathrm{d}\sigma}{\mathrm{d}y} = \frac{4\pi\alpha^2}{t^2}\,s\,\frac{1}{2}[1 + (1-y)^2].$$

FIGURE 8.20
(a) Total cross sections for $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$ and $\mathrm{e}^+\mathrm{e}^- \to \tau^+\tau^-$; (b) differential cross section
for $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$. (From D H Perkins 2000 Introduction to High Energy Physics 4th edn,
courtesy Cambridge University Press.)


8.18 Consider the process $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$ in the CM frame.

(a) Draw the lowest-order Feynman diagram and write down the corresponding
amplitude.


(b) Show that the spin-averaged squared matrix element has the form

$$|\bar{\mathcal{M}}|^2 = \frac{(4\pi\alpha)^2}{q^4}\,L(\mathrm{e})_{\mu\nu}L(\mu)^{\mu\nu}$$

where $q^2$ is the square of the total CM energy, and $L(\mathrm{e})$ depends on the $\mathrm{e}^-$ and
$\mathrm{e}^+$ momenta and $L(\mu)$ on those of the $\mu^+$, $\mu^-$.


(c) Evaluate the traces and the tensor contraction (neglecting lepton masses): (i)
directly, using the trace theorems and (ii) by using crossing symmetry and the
results of section 8.7 for $\mathrm{e}^-\mu^-$ scattering. Hence show that

$$|\bar{\mathcal{M}}|^2 = (4\pi\alpha)^2(1 + \cos^2\theta)$$

where $\theta$ is the CM scattering angle, and that the CM differential cross section is

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \frac{\alpha^2}{4q^2}(1 + \cos^2\theta).$$


(d) Hence show that the total cross section is (see equation (B.18) of appendix B)

$$\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2).$$


Figure 8.20 shows data (a) for $\sigma$ in $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$ and $\mathrm{e}^+\mathrm{e}^- \to \tau^+\tau^-$ and (b)
for the angular distribution in $\mathrm{e}^+\mathrm{e}^- \to \mu^+\mu^-$. Note that $s = q^2$. The data in
(a) agree well with the prediction of part (d). The broken curve in figure 8.20(b)
shows the pure QED prediction of part (c) for $\mathrm{d}\sigma/\mathrm{d}\Omega$.


It is clear that, while the distribution has the general $1 + \cos^2\theta$ form as predicted,
there is a small but definite forward-backward asymmetry. This arises because,
in addition to the $\gamma$-exchange amplitude there is also a $\mathrm{Z}^0$-exchange amplitude
(see section 22.3 of volume 2) which we have neglected. Such asymmetries are an
important test of the electroweak theory. They are too small to be visible in the
total cross sections in (a).


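8.19 Verify equation (8.207). [Hint: as in equation (8.191) the terms in $q^\mu$ and $q^\nu$ in $B^{\mu\nu}$
may be neglected because of the conditions (8.189).]

8.20 Starting from the expression

$$\bar{u}(p')\,\frac{\mathrm{i}\sigma^{\mu\nu}q_\nu}{2M}\,u(p)$$

where $q = p' - p$ and $\sigma^{\mu\nu} = \frac{1}{2}\mathrm{i}[\gamma^\mu, \gamma^\nu]$, use the Dirac equation and properties of $\gamma$-matrices
to prove the ‘Gordon decomposition’ of the current

$$\bar{u}(p')\gamma^\mu u(p) = \bar{u}(p')\left(\frac{(p+p')^\mu}{2M} + \frac{\mathrm{i}\sigma^{\mu\nu}q_\nu}{2M}\right)u(p).$$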


9 


Deep Inelastic Electron—Nucleon Scattering and the 
Parton Model 


We have obtained the rules for doing calculations of simple processes in quantum electrodynamics
for particles of spin-0 and spin-$\frac{1}{2}$, and many explicit examples have been considered.
In this chapter we build on these results to give an (admittedly brief) introduction to a topic 
of central importance in particle physics, the structure of hadrons as revealed by deep in- 
elastic scattering experiments (the equally important neutrino scattering experiments will 
be discussed in volume 2). We do this partly because the necessary calculations involve 
straightforward, illustrative and eminently practical applications of the rules already ob- 
tained, but, more particularly, because it is from a comparison of these calculations with 
experiment that compelling evidence was obtained for the existence of the point-like con- 
stituents of hadrons—quarks and gluons—the interactions of which are described by QCD. 


9.1 Inelastic electron—proton scattering: kinematics and structure 
functions 


At large momentum transfers there is very little elastic scattering: inelastic scattering, in 
which there is more than just the electron and proton in the final state, is much more 
probable. The simplest inelastic cross section to measure is the so-called inclusive cross 
section, for which only the final electron is observed. This is therefore a sum over the 
cross sections for all the possible hadronic final states: no attempt is made to select any 
particular state from the hadronic debris created at the proton vertex. This process may be 
represented by the diagram of figure 9.1, assuming that the one-photon exchange amplitude 
dominates. The ‘blob’ at the proton vertex indicates our ignorance of the detailed structure: 
X indicates a sum over all possible hadronic final states. However, the assumption of one- 
photon exchange, which is known experimentally to be a very good approximation, means 
that, as in our previous examples (cf (8.118) and (8.185)), the cross section must factorize 
into a leptonic tensor contracted with a tensor describing the hadron vertex: 


$$\mathrm{d}\sigma \sim L_{\mu\nu}W^{\mu\nu}(q, p). \tag{9.1}$$

The lepton vertex is well described by QED and takes the same form as before:

$$L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}]. \tag{9.2}$$


For the hadron tensor, however, we expect strong interactions to play an important role 
and we must deduce its general structure by our powerful invariance arguments. We will 
only consider unpolarized scattering and therefore perform an average over the initial proton 
spins. The sum over final states, X, includes all possible quantum numbers for each hadronic 
state with total momentum $p'$. For an inclusive cross section, the final phase space involves
only the scattered electron. Moreover, since we are not restricting the scattering process
by picking out any specific state of X, the energy $k'$ and the scattering angle $\theta$ of the final
electron are now independent variables. In $W^{\mu\nu}(q, p)$ the sum over X includes the phase
space for each hadronic state restricted by the usual 4-momentum-conserving $\delta$-function
to ensure that each state in X has momentum $p'$. Including some conventional factors, we
define $W^{\mu\nu}(q, p)$ by (see problem 9.1)

$$e^2W^{\mu\nu}(q, p) = \frac{1}{4\pi M}\,\frac{1}{2}\sum_s\sum_{\mathrm{X}}\langle \mathrm{p};p,s|\hat{j}^\mu_{\mathrm{em,p}}(0)|\mathrm{X};p'\rangle\langle \mathrm{X};p'|\hat{j}^\nu_{\mathrm{em,p}}(0)|\mathrm{p};p,s\rangle\,(2\pi)^4\delta^4(p + q - p'). \tag{9.3}$$

FIGURE 9.1
Inelastic electron-proton scattering, in one-photon exchange approximation.

How do we parametrize the tensor structure of $W^{\mu\nu}$? As usual, Lorentz invariance and
current conservation come to our aid. There is one important difference compared with the
elastic form factor case of section 8.8. For inclusive inelastic scattering there are now two
independent scalar variables. The relation

$$p' = p + q \tag{9.4}$$

leads to

$$p'^2 = M^2 + 2p\cdot q + q^2 \tag{9.5}$$

where $M$ is the proton mass. In this case, the invariant mass of the hadronic final state is
a variable

$$p'^2 \equiv W^2 \tag{9.6}$$

and is related to the other two scalar variables

$$p\cdot q = M\nu \tag{9.7}$$

and (cf (8.223))

$$q^2 = -Q^2 \tag{9.8}$$

by the condition (cf (8.229))

$$2M\nu = Q^2 + W^2 - M^2. \tag{9.9}$$

Our invariance arguments lead us to the same tensor structure as for elastic electron-proton
scattering, but now the functions $A(q^2)$, $B(q^2)$ are replaced by ‘structure functions’
which are functions of two variables, usually taken to be $\nu$ and $Q^2$. The conventional
definition of the proton structure functions $W_1$ and $W_2$ is

$$\begin{aligned}
W^{\mu\nu}(q, p) = {} & (-g^{\mu\nu} + q^\mu q^\nu/q^2)\,W_1(Q^2, \nu) \\
& + [p^\mu - (p\cdot q/q^2)q^\mu][p^\nu - (p\cdot q/q^2)q^\nu]\,M^{-2}\,W_2(Q^2, \nu).
\end{aligned} \tag{9.10}$$



Inserting the usual flux factor together with the final electron phase space leads to the
following expression for the inclusive differential cross section for inelastic electron-proton
scattering (see problem 9.1):

$$\mathrm{d}\sigma = \left(\frac{4\pi\alpha}{q^2}\right)^2\frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\,4\pi M\,L_{\mu\nu}W^{\mu\nu}\,\frac{\mathrm{d}^3k'}{2\omega'(2\pi)^3}. \tag{9.11}$$

In terms of ‘laboratory’ variables, neglecting electron mass effects, this yields (problem
9.2(a))

$$\frac{\mathrm{d}^2\sigma}{\mathrm{d}\Omega\,\mathrm{d}k'} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,[W_2\cos^2(\theta/2) + 2W_1\sin^2(\theta/2)]. \tag{9.12}$$

Remembering now that $\cos\theta$ and $k'$ are independent variables for inelastic scattering, we
can change variables from $\cos\theta$ and $k'$ to $Q^2$ and $\nu$, assuming azimuthal symmetry for the
unpolarized cross section. We have

$$Q^2 = 2kk'(1 - \cos\theta) \tag{9.13}$$

$$\nu = k - k' \tag{9.14}$$

so that (problem 9.2(b))

$$\mathrm{d}(\cos\theta)\,\mathrm{d}k' = \frac{1}{2kk'}\,\mathrm{d}Q^2\,\mathrm{d}\nu \tag{9.15}$$

and

$$\frac{\mathrm{d}^2\sigma}{\mathrm{d}Q^2\,\mathrm{d}\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\,[W_2\cos^2(\theta/2) + 2W_1\sin^2(\theta/2)]. \tag{9.16}$$

Yet another choice of variables is sometimes used instead of these, namely the dimensionless
variables

$$x = Q^2/2M\nu \tag{9.17}$$

whose significance we shall see in the next section, and

$$y = \nu/k \tag{9.18}$$

which is the fractional energy transfer in the ‘laboratory’ frame. Note that relation (8.224)
shows that $x = 1$ for elastic scattering. The Jacobian for the transformation from $Q^2$ and
$\nu$ to $x$ and $y$ is (see problem 9.2(b))

$$\mathrm{d}Q^2\,\mathrm{d}\nu = 2Mk^2y\,\mathrm{d}x\,\mathrm{d}y. \tag{9.19}$$


We emphasize that the foregoing, in particular (9.3), (9.12), and (9.16), is all completely
general, given the initial one-photon approximation. The physics is all contained in the $\nu$
and $Q^2$ dependence of the two structure functions $W_1$ and $W_2$.
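For orientation, the kinematic variables (9.9), (9.13), (9.14), (9.17) and (9.18) can be collected in one small helper. This is a sketch under stated assumptions: the function name, the dictionary layout and the numerical inputs are ours, the electron mass is neglected, and the proton mass value is an assumed input.

```python
import math

PROTON_MASS = 0.938272  # GeV (assumed input value)


def dis_variables(k, k_prime, theta, M=PROTON_MASS):
    """Laboratory-frame kinematic variables for inclusive electron scattering.

    k, k_prime: incident and scattered electron energies (GeV), mass neglected;
    theta: laboratory scattering angle (radians).
    """
    Q2 = 2.0 * k * k_prime * (1.0 - math.cos(theta))   # (9.13)
    nu = k - k_prime                                    # (9.14)
    x = Q2 / (2.0 * M * nu)                             # (9.17)
    y = nu / k                                          # (9.18)
    W2 = M * M + 2.0 * M * nu - Q2                      # from (9.9)
    return {"Q2": Q2, "nu": nu, "x": x, "y": y, "W2": W2}


# An inelastic point: the hadronic mass squared exceeds M^2 and 0 < x < 1.
print(dis_variables(10.0, 4.0, 0.2))

# On the elastic line k' is fixed by theta, and x = 1, W^2 = M^2.
theta = 0.2
k_elastic = 10.0 / (1.0 + (10.0 / PROTON_MASS) * (1.0 - math.cos(theta)))
print(dis_variables(10.0, k_elastic, theta))
```

Because $Q^2 = 2Mkxy$ and $\nu = ky$, the pair $(x, y)$ carries the same information as $(Q^2, \nu)$ at fixed beam energy, which is what makes the change of variables (9.19) possible.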

A priori, one might expect $W_1$ and $W_2$ to be complicated functions of $\nu$ and $Q^2$,
reflecting the complexity of the inelastic scattering process. However, in 1969 Bjorken predicted
that in the ‘deep inelastic region’ (large $\nu$ and $Q^2$, but $Q^2/\nu$ finite) there should be a
very simple behaviour. He predicted that the structure functions should scale, i.e. become
functions not of $Q^2$ and $\nu$ independently but only of their ratio $Q^2/\nu$. It was the verification
of approximate ‘Bjorken scaling’ that led to the development of the modern parton model.
We therefore specialize our discussion of inelastic scattering to the deep inelastic region.


FIGURE 9.2

Bjorken scaling: the structure function $\nu W_2$ (a) plotted against $x$ for different $Q^2$ values
(Attwood 1980, courtesy SLAC) and (b) plotted against $Q^2$ for the single $x$ value, $x = 0.25$
(Friedman and Kendall 1972).


9.2  Bjorken scaling and the parton model 


From considerations based on the quark model current algebra of Gell-Mann (1962), Bjorken
(1969) was led to propose the following 'scaling hypothesis': in the limit

$$ Q^2 \to \infty, \quad \nu \to \infty \quad \text{with } x = Q^2/2M\nu \text{ fixed} \qquad (9.20) $$

the structure functions scale as

$$ MW_1(Q^2,\nu) \to F_1(x) \qquad (9.21) $$
$$ \nu W_2(Q^2,\nu) \to F_2(x). \qquad (9.22) $$

We must emphasize that the physical content of Bjorken's hypothesis is that the functions
$F_1(x)$ and $F_2(x)$ are finite¹.

Early experimental support for these predictions (figure 9.2) led initially to an examination
of the theoretical basis of Bjorken's arguments and to the formulation of the simple
intuitive picture provided by the parton model. Closer scrutiny of figure 9.2(a) will encourage
the (correct) suspicion that, in fact, there is a small but significant spread in the data
for any given $x$ value. In volume 2 we shall give an introduction to the way in which QCD
corrections to the parton model lead to predictions for logarithmic (in $Q^2$) violations of simple
scaling behaviour, which are in excellent agreement with experiment. These violations
are particularly large at small values of $x$; for $x$ greater than about 0.1, the structure functions
are substantially independent of $Q^2$, for a given $x$. The scaling predicted by Bjorken
is certainly the most immediate gross feature of the data, and an understanding of it is of
fundamental importance.

¹It is always possible to write $W(Q^2,\nu) = f(x,Q^2)$, say, where $f(x,Q^2)$ will tend to some function $F(x)$
as $Q^2 \to \infty$ with $x$ fixed. $F(x)$ may, however, be zero, finite or infinite. The physics lies in the hypothesis
that, in this limit, a finite part remains.

FIGURE 9.3
Photon–parton interaction.

How can the scaling be understood? Feynman, when asked to explain Bjorken's arguments,
gave an intuitive explanation in terms of elastic scattering from free point-like
constituents of the nucleon, which he dubbed 'partons' (Feynman 1969). The essence of the
argument lies in the kinematics of elastic scattering of electrons by free point-like charged 
partons. We will therefore be able to use the results of the previous chapters to derive the 
parton model results. At high $Q^2$ and $\nu$ it is intuitively reasonable (and in fact the basis
for the light-cone and short-distance operator approach (Wilson 1969) to scaling) that the
virtual photon is probing very short distances and time scales within the proton. In this
situation, Feynman supposed that the photon interacts with small (point-like) constituents
within the proton, which carry only a certain fraction $f$ of the proton's energy and momentum
(figure 9.3). Over the short time scales involved in the transfer of a large amount of
energy $\nu$, and at the short distances probed at large $Q^2$, the struck constituents can perhaps
be treated as effectively free and independent. (This is in sharp contrast to the case of
elastic scattering, where the constituents are acting coherently.) We then have the idealized
elastic scattering process shown in figure 9.4. It is the kinematics of the elastic scattering
condition for the partons that leads directly to a relation between $Q^2$ and $\nu$ and hence to
the observed scaling behaviour. The original discussion of the parton model took place in
the infinite-momentum frame of the proton. While this has the merit that it eliminates the 
need for explicit statements about parton masses and so on, it also obscures the simple 
kinematic origin of the scaling. For this reason, at the expense of some theoretical niceties, 
we prefer to perform a direct calculation of electron—parton scattering in close analogy with 
our previous examples. 
We first show that the fraction $f$ is none other than Bjorken's variable $x$. For a parton
of type $i$ we write
$$ p_i^\mu = f p^\mu \qquad (9.23) $$
and, roughly speaking², we can imagine that the partons have mass
$$ m_i \approx fM. \qquad (9.24) $$

FIGURE 9.4
Elastic electron–parton scattering.

FIGURE 9.5
Structure function for quasi-elastic ed scattering, plotted against $x$ (Attwood 1980, courtesy
SLAC).

Then, exactly as in (8.216) and (8.217), energy and momentum conservation at the parton
vertex, together with the assumption that the struck parton remains on-shell (as indicated
by the fact that in figure 9.4 the partons are free), imply that

$$ (q + fp)^2 = m_i^2 \qquad (9.25) $$
which, using (9.8), (8.222), and (9.24), gives
$$ f = Q^2/2M\nu \equiv x. \qquad (9.26) $$


Thus the fact that the nucleon structure functions do seem to depend (to a good approximation)
only on the variable $x$ is interpreted physically as showing that the scattering is
dominated by the 'quasi-free' electron–parton process shown in figure 9.4. In section 11.5.3
we shall see how the 'asymptotic freedom' property of QCD suggests a dynamical understanding
of this picture, as will be discussed further in chapter 15 of volume 2. We shall also
see in section 15.6 how QCD corrections to the free parton model give observable violations
of Bjorken scaling, providing tests of QCD.
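
The identification $f = x$ in (9.26) can be illustrated with explicit four-vectors. The sketch below is an illustration under stated assumptions (the helper names, the rounded proton mass and the sample values of $\nu$ and $Q^2$ are not from the text): it solves the on-shell condition (9.25) with $m_i = fM$ in the target rest frame and compares the result with $Q^2/2M\nu$.

```python
import math


def minkowski_dot(a, b):
    """Four-vector product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def parton_fraction(q, p):
    """Solve (q + f p)^2 = (f M)^2 for f, where p^2 = M^2.

    Expanding gives q^2 + 2 f p.q = 0, i.e. f = -q^2 / (2 p.q)."""
    return -minkowski_dot(q, q) / (2.0 * minkowski_dot(p, q))


M = 0.938                      # GeV; rounded proton mass (illustrative)
nu, Q2 = 8.0, 3.0              # hypothetical energy transfer (GeV) and Q^2 (GeV^2)

p = (M, 0.0, 0.0, 0.0)                         # proton at rest
q = (nu, 0.0, 0.0, math.sqrt(nu**2 + Q2))      # space-like photon with q^2 = -Q2

f = parton_fraction(q, p)
x = Q2 / (2.0 * M * nu)

# Mass squared of the struck parton after absorbing the photon.
struck = tuple(qi + f * pi for qi, pi in zip(q, p))
final_mass_sq = minkowski_dot(struck, struck)

print(f, x)                        # the two agree
print(final_mass_sq, (f * M)**2)   # the parton stays on-shell
```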

What sort of values for $x$ do we expect? Consider an analogous situation: electron
scattering from deuterium. Here the target (the deuteron) is undoubtedly composite, and
its 'partons' are, to a first approximation, just the two nucleons. Since $m_{\rm N} \simeq \frac{1}{2}m_{\rm D}$, we
expect to see the value $x \simeq \frac{1}{2}$ (cf (9.24)) favoured; $x = 1$ here would correspond to elastic
scattering from the deuteron. A peak at $x \approx \frac{1}{2}$ is indeed observed (figure 9.5) in quasi-elastic
ed scattering (the broadening of the peak is due to the fact that the constituent
nucleons have some motion within the deuteron). By 'quasi-elastic' here we mean that the
incident electron scatters off 'quasi-free' nucleons, an approximation we expect to be good for
incident energies significantly greater than the binding energy of the n and p in the deuteron
(~2 MeV). What about the nucleon itself, then? A simple three-quark model would, on this
analogy, lead us to expect a peak at $x \simeq \frac{1}{3}$, but the data already shown (figure 9.2(a))
do not look much like that. Perhaps there is something else present too, which we shall
uncover as our story proceeds.

²Explicit statements about parton transverse momenta and masses, such as those made in
equations (9.23) and (9.24), are unnecessary in a rigorous treatment, where such quantities can be shown to give
rise to non-leading scaling behaviour (Sachrajda 1983).

Certainly, it seems sensible to suppose that a nucleon contains at least some quarks
(and also anti-quarks) of the type introduced in the simple composite models of the nucleon
(section 1.2.2). If quarks are supposed to have spin-$\frac{1}{2}$, then the scattering of an electron
from a quark or anti-quark (generically a charged parton) of type $i$, charge $e_i$ (in units of
$e$), is just given by the e$\mu$ scattering cross section (8.228), with obvious modifications:

$$ \frac{d^2\sigma^i}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left(e_i^2\cos^2(\theta/2) + e_i^2\,\frac{Q^2}{4m_i^2}\,2\sin^2(\theta/2)\right)\delta(\nu - Q^2/2m_i). \qquad (9.27) $$


This is to be compared with the general inclusive inelastic cross section formula written in
terms of $W_1$ and $W_2$:

$$ \frac{d^2\sigma}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)\right]. \qquad (9.28) $$


Thus the contribution to $W_1$ and $W_2$ from one parton of type $i$ is immediately seen to be

$$ W_1^i = e_i^2\,\frac{Q^2}{4M^2x^2}\,\delta(\nu - Q^2/2Mx) \qquad (9.29) $$
$$ W_2^i = e_i^2\,\delta(\nu - Q^2/2Mx) \qquad (9.30) $$


where we have set $m_i = xM$. At large $\nu$ and $Q^2$ it is assumed that the contributions from
different partons add incoherently in cross section. Thus, to obtain the total contribution
from all quark partons, we must sum over the contributions from all types of partons, $i$, and
integrate over all values of $x$, the momentum fraction carried by the parton. The integral
over $x$ must be weighted by the probability $f_i(x)$ for the parton of type $i$ to have a fraction $x$
of momentum. These probability distributions, or parton distribution functions (PDFs),
are not predicted by the model and are, in this parton picture, fundamental parameters of
the proton. The structure function $W_2$ becomes

$$ W_2(\nu,Q^2) = \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \delta(\nu - Q^2/2Mx). \qquad (9.31) $$

Using the result for the Dirac $\delta$-function (see appendix E, equation (E.34))
$$ \delta(g(x)) = \frac{\delta(x - x_0)}{|dg/dx|_{x=x_0}} \qquad (9.32) $$
where $x_0$ is defined by $g(x_0) = 0$, we can rewrite
$$ \delta(\nu - Q^2/2Mx) = (x/\nu)\,\delta(x - Q^2/2M\nu) \qquad (9.33) $$

under the $x$ integral. Hence we obtain

$$ \nu W_2(\nu,Q^2) = \sum_i e_i^2\, x f_i(x) \equiv F_2(x) \qquad (9.34) $$
which is the desired scaling behaviour. Similar manipulations lead to
$$ MW_1(\nu,Q^2) = F_1(x) \qquad (9.35) $$

where
$$ 2xF_1(x) = F_2(x). \qquad (9.36) $$
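
For completeness, the 'similar manipulations' for $W_1$ can be spelled out. The following is a sketch that uses only (9.29), the $\delta$-function identity (9.33), and the fact that $Q^2 = 2M\nu x$ on the support of the $\delta$-function:

```latex
\begin{align*}
W_1(\nu,Q^2)
 &= \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \frac{Q^2}{4M^2x^2}\,
    \delta(\nu - Q^2/2Mx) \\
 &= \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \frac{Q^2}{4M^2x^2}\,
    \frac{x}{\nu}\, \delta(x - Q^2/2M\nu) \\
 &= \sum_i e_i^2\, f_i(x)\, \frac{Q^2}{4M^2 x\,\nu}\bigg|_{x = Q^2/2M\nu}
  = \frac{1}{2M} \sum_i e_i^2\, f_i(x) .
\end{align*}
```

Hence $MW_1 = \frac{1}{2}\sum_i e_i^2 f_i(x) \equiv F_1(x)$, and comparison with $F_2(x) = \sum_i e_i^2\, x f_i(x)$ gives $2xF_1(x) = F_2(x)$.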


This relation between $F_1$ and $F_2$ is called the Callan–Gross relation (see Callan and
Gross 1969). It is a direct consequence of our assumption of spin-$\frac{1}{2}$ partons. The physical
origin of this relation is best discussed in terms of virtual photon total cross sections for
transverse ($\lambda = \pm 1$) virtual photons and for a longitudinal/scalar ($\lambda = 0$) virtual photon
contribution. The longitudinal/scalar photon is present because $q^2 \neq 0$ for a virtual photon
(see comment (4) in section 8.3.1). However, in the discussion of polarization vectors a slight
difference occurs for space-like $q^2$. In a frame in which

$$ q^\mu = (q^0, 0, 0, q^3) \qquad (9.37) $$

the transverse polarization vectors are as before

$$ \epsilon^\mu(\lambda = \pm 1) = \mp 2^{-1/2}(0, 1, \pm i, 0) \qquad (9.38) $$

with normalization (see equation (7.87))
$$ \epsilon^* \cdot \epsilon = -1. \qquad (9.39) $$
To construct the longitudinal/scalar polarization vector, we must satisfy

$$ q \cdot \epsilon = 0 \qquad (9.40) $$
and so are led to the result

$$ \epsilon^\mu(\lambda = 0) = (1/\sqrt{Q^2})(q^3, 0, 0, q^0) \qquad (9.41) $$

with
$$ \epsilon^2(\lambda = 0) = +1. \qquad (9.42) $$
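
The properties (9.39), (9.40) and (9.42) can be verified directly. The sketch below is illustrative only (the function names and the sample photon momentum are assumptions): it builds the three polarization vectors for a space-like $q^\mu = (q^0, 0, 0, q^3)$ and evaluates the relevant Minkowski products.

```python
import math


def mdot(a, b):
    """Minkowski product a.b with metric (+, -, -, -); no complex conjugation."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def conj(a):
    """Complex conjugate of every component of a four-vector."""
    return tuple(complex(c).conjugate() for c in a)


def polarization_vectors(q0, q3):
    """Polarization vectors (9.38) and (9.41) for q = (q0, 0, 0, q3) with q^2 < 0."""
    Q2 = q3**2 - q0**2
    if Q2 <= 0:
        raise ValueError("expected a space-like photon, Q^2 = -q^2 > 0")
    s = 1.0 / math.sqrt(2.0)
    root = math.sqrt(Q2)
    return {
        +1: (0.0, -s, -1j * s, 0.0),             # -(1/sqrt2)(0, 1, +i, 0)
        -1: (0.0, +s, -1j * s, 0.0),             # +(1/sqrt2)(0, 1, -i, 0)
        0: (q3 / root, 0.0, 0.0, q0 / root),     # longitudinal/scalar
    }


# Hypothetical lab-frame photon: nu = 8 GeV, Q^2 = 3 GeV^2.
q0 = 8.0
q3 = math.sqrt(q0**2 + 3.0)
q = (q0, 0.0, 0.0, q3)
eps = polarization_vectors(q0, q3)

transversality = {lam: mdot(q, e) for lam, e in eps.items()}   # all zero, (9.40)
norms = {lam: mdot(conj(e), e) for lam, e in eps.items()}      # -1, -1, +1
print(transversality)
print(norms)
```

The sign of the $\lambda = 0$ norm is the 'slight difference' for space-like $q^2$: the longitudinal/scalar vector is time-like, while the transverse ones are space-like.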


The precise definition of a virtual photon cross section is obviously just a convention. It is
usually taken to be
$$ \sigma_\lambda(\gamma p \to X) = (4\pi^2\alpha/K)\,\epsilon_\mu^*(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu} \qquad (9.43) $$

by analogy with the total cross section for real photons of polarization $\lambda$ incident on an
unpolarized proton target. Note the presence of the factor $W^{\mu\nu}$ defined in (9.3). The factor
$K$ is the flux factor; for real photons, producing a final state of mass $W$, this is just the
photon energy in the rest frame of the target nucleon:

$$ K = (W^2 - M^2)/2M. \qquad (9.44) $$

In the so-called Hand convention, this same factor is used for virtual photons which produce
a final state of mass $W$. With these definitions we find (see problem 9.3) that the transverse
($\lambda = \pm 1$) photon cross section

$$ \sigma_T = (4\pi^2\alpha/K)\,\tfrac{1}{2}\sum_{\lambda=\pm 1}\epsilon_\mu^*(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu} \qquad (9.45) $$

is given by
$$ \sigma_T = (4\pi^2\alpha/K)\,W_1 \qquad (9.46) $$
and the longitudinal/scalar cross section
$$ \sigma_S = (4\pi^2\alpha/K)\,\epsilon_\mu^*(\lambda = 0)\,\epsilon_\nu(\lambda = 0)\,W^{\mu\nu} \qquad (9.47) $$

by
$$ \sigma_S = (4\pi^2\alpha/K)\,[(1 + \nu^2/Q^2)W_2 - W_1]. \qquad (9.48) $$

In fact these expressions give an intuitive explanation of the positivity properties of $W_1$ and
$W_2$, namely

$$ W_1 \geq 0 \qquad (9.49) $$
$$ (1 + \nu^2/Q^2)W_2 - W_1 \geq 0. \qquad (9.50) $$

The combination in the $\lambda = 0$ cross section is sometimes denoted by $W_L$:
$$ W_L = (1 + \nu^2/Q^2)W_2 - W_1. \qquad (9.51) $$
The scaling limit of these expressions can be taken using

$$ \nu W_2 \to F_2 \qquad (9.52) $$
$$ MW_1 \to F_1 \qquad (9.53) $$

and $x = Q^2/2M\nu$ finite, as $Q^2$ and $\nu$ grow large. We find

$$ \sigma_T \to (4\pi^2\alpha/MK)\,F_1(x) \qquad (9.54) $$
and
$$ \sigma_S \to (4\pi^2\alpha/MK)\,(1/2x)\,(F_2 - 2xF_1) \qquad (9.55) $$

where we have neglected a term of order $MF_2/\nu$ in the last expression. Thus the
Callan–Gross relation corresponds to the result

$$ \sigma_S/\sigma_T \to 0 \qquad (9.56) $$


in terms of photon cross sections. 
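
To see how the ratio behaves at finite $Q^2$, one can evaluate (9.46) and (9.48) with structure functions that obey the Callan–Gross relation exactly. This is a sketch under that stated assumption, with hypothetical function names and an arbitrary value of $F_2$; in that case the ratio reduces algebraically to $2Mx/\nu = 4M^2x^2/Q^2$, which is the neglected term of order $M/\nu$ and vanishes in the scaling limit.

```python
def callan_gross_structure_functions(x, nu, M, F2):
    """W1 and W2 built from scaling functions with 2 x F1 = F2 imposed exactly
    (an assumption made here for illustration)."""
    F1 = F2 / (2.0 * x)
    return F1 / M, F2 / nu


def sigma_s_over_sigma_t(W1, W2, nu, Q2):
    """Ratio of (9.48) to (9.46); the common factor 4 pi^2 alpha / K cancels."""
    return ((1.0 + nu**2 / Q2) * W2 - W1) / W1


M = 0.938    # GeV; rounded proton mass (illustrative)
x = 0.25     # fixed Bjorken x
F2 = 0.3     # arbitrary value; it cancels in the ratio

ratios = {}
for Q2 in (2.0, 20.0, 200.0):               # GeV^2
    nu = Q2 / (2.0 * M * x)                 # from x = Q^2 / 2 M nu
    W1, W2 = callan_gross_structure_functions(x, nu, M, F2)
    ratios[Q2] = sigma_s_over_sigma_t(W1, W2, nu, Q2)

for Q2, r in ratios.items():
    print(Q2, r, 4.0 * M**2 * x**2 / Q2)    # numerical ratio vs 4 M^2 x^2 / Q^2
```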
A parton calculation using point-like spin-0 partons shows the opposite result, namely

$$ \sigma_T/\sigma_S \to 0. \qquad (9.57) $$


Both these results may be understood by considering the helicities of partons and photons in 
the so-called parton Breit or ‘brick-wall’ frame. The particular frame is the one in which the 
photon and parton are collinear and the 3-momentum of the parton is exactly reversed by the 
collision (see figure 9.6). In this frame, the photon transfers no energy, only 3-momentum. 


FIGURE 9.6
Photon–parton interaction in the Breit frame.

FIGURE 9.7
Angular momentum balance for absorption of photon by helicity-conserving spin-$\frac{1}{2}$ parton.


The vanishing of transverse photon cross sections for scalar partons is now obvious. The
transverse photons bring in $\pm 1$ units of the $z$-component of angular momentum: spin-0
partons cannot absorb this. Thus only the scalar $\lambda = 0$ cross section is non-zero. For
spin-$\frac{1}{2}$ partons the argument is slightly more complicated in that it depends on the helicity
properties of the $\gamma_\mu$ coupling of the parton to the photon. As is shown in problem 9.4,
for massless spin-$\frac{1}{2}$ particles the $\gamma_\mu$ coupling conserves helicity, i.e. the projection of spin
along the direction of motion of the particle. Thus in the Breit frame, and neglecting parton
masses, conservation of helicity necessitates a change in the $z$-component of the parton's
angular momentum by $\pm 1$ unit, thereby requiring the absorption of a transverse photon
(figure 9.7). The Lorentz transformation from the parton Breit frame to the 'laboratory'
frame does not affect the ratio of transverse to longitudinal photons, if we neglect the
parton transverse momenta. These arguments therefore make clear the origin of the
Callan–Gross relation. Experimentally, the Callan–Gross relation is reasonably well satisfied in that
$R = \sigma_S/\sigma_T$ is small for most, if not all, of the deep inelastic regime (figure 9.8). This leads
us to suppose that the electrically charged partons coupling to photons have spin-$\frac{1}{2}$.


9.3 Partons as quarks and gluons 


We now proceed a stage further, with the idea that the charged partons are quarks (and 
anti-quarks). If we assume that the photon only couples to these objects, we can make more 
specific scaling predictions. The quantum numbers of the quarks have been given in Table 
1.2. For a proton we have the result (cf (9.34)) 


$$ F_2^{\rm ep}(x) = x\left\{\tfrac{4}{9}[u(x) + \bar{u}(x)] + \tfrac{1}{9}[d(x) + \bar{d}(x) + s(x) + \bar{s}(x)] + \cdots\right\} \qquad (9.58) $$


where $u(x)$ is the probability distribution for u quarks in the proton, $\bar{u}(x)$ for u anti-quarks
and so on in an obvious notation, and the dots indicate further possible flavours. So far, we
do not seem to have gained much, replacing one unknown function by six or more unknown 
functions. The full power of the quark parton model lies in the fact that the same distribution 
functions appear, in different combinations, for neutron targets, and in the analogous scaling 
functions for deep inelastic scattering with neutrino and antineutrino beams (see volume 2). 
For electron scattering from neutron targets we can use I-spin invariance (see for example
Close 1979, or Leader and Predazzi 1996) to relate the distribution of u and d quarks in a
neutron to the distributions in a proton, and similarly for the antiquarks. The results are

$$ u^{\rm p}(x) = d^{\rm n}(x) \equiv u(x), \qquad \bar{u}^{\rm p}(x) = \bar{d}^{\rm n}(x) \equiv \bar{u}(x) \qquad (9.59) $$
$$ d^{\rm p}(x) = u^{\rm n}(x) \equiv d(x), \qquad \bar{d}^{\rm p}(x) = \bar{u}^{\rm n}(x) \equiv \bar{d}(x) \qquad (9.60) $$
$$ s^{\rm p}(x) = s^{\rm n}(x) \equiv s(x), \qquad \bar{s}^{\rm p}(x) = \bar{s}^{\rm n}(x) \equiv \bar{s}(x). \qquad (9.61) $$

Hence the scaling function for en scattering may be written
$$ F_2^{\rm en}(x) = x\left\{\tfrac{4}{9}[d(x) + \bar{d}(x)] + \tfrac{1}{9}[u(x) + \bar{u}(x) + s(x) + \bar{s}(x)] + \cdots\right\}. \qquad (9.62) $$

FIGURE 9.8

The ratio $2xF_1/F_2$: ○, $1.5 < Q^2 < 4$ GeV²; ●, $0.5 < Q^2 < 11$ GeV²; ×, $12 < Q^2 < 16$ GeV².
(Figure from D H Perkins Introduction to High Energy Physics 3rd edn, copyright 1987;
reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ.)


The quark distributions inside the proton and neutron must satisfy some constraints.
Since both proton and neutron have strangeness zero, we have a sum rule (treating only u,
d and s flavours from now on)

$$ \int_0^1 dx\,[s(x) - \bar{s}(x)] = 0. \qquad (9.63) $$
Similarly, from the proton and neutron charges we obtain two other sum rules:
$$ \int_0^1 dx\,\left\{\tfrac{2}{3}[u(x) - \bar{u}(x)] - \tfrac{1}{3}[d(x) - \bar{d}(x)]\right\} = 1 \qquad (9.64) $$
$$ \int_0^1 dx\,\left\{\tfrac{2}{3}[d(x) - \bar{d}(x)] - \tfrac{1}{3}[u(x) - \bar{u}(x)]\right\} = 0. \qquad (9.65) $$
These are equivalent to the sum rules
$$ 2 = \int_0^1 dx\,[u(x) - \bar{u}(x)] \qquad (9.66) $$
$$ 1 = \int_0^1 dx\,[d(x) - \bar{d}(x)] \qquad (9.67) $$
which are, of course, just the excess of u and d quarks over anti-quarks inside the proton.
Testing these sum rules requires neutrino data to separate the various structure functions,
as we shall explain in volume 2, chapter 20.

One can gain some further insight if one is prepared to make a model. For example,
one can introduce the idea of 'valence' quarks (those of the elementary constituent quark
model) and 'sea' quarks ($q\bar{q}$ pairs created virtually). Then, in a proton, the u and d quark
distributions would be parametrized by the sum of valence and sea contributions

$$ u = u_V + q_S \qquad (9.68) $$
$$ d = d_V + q_S \qquad (9.69) $$

while the anti-quark and strange quark distributions are taken to be pure sea
$$ \bar{u} = \bar{d} = s = \bar{s} = q_S \qquad (9.70) $$

where we have assumed that the 'sea' is flavour-independent. Such a model replaces the
six unknown functions now in play by three, and is consequently more predictive. The
strangeness sum rule (9.63) is now satisfied automatically, while (9.66) and (9.67) are
satisfied by the valence distributions alone:

$$ \int_0^1 dx\, u_V(x) = 2 \qquad (9.71) $$
$$ \int_0^1 dx\, d_V(x) = 1. \qquad (9.72) $$
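
A toy realization of this valence-plus-sea model makes the sum rules concrete. Everything specific in the sketch below is an assumption chosen for illustration, not a fit to data: the functional forms $x^{a-1}(1-x)^{b-1}$ for the valence quarks, the sea shape $0.2\,(1-x)^7/x$, and all function names. The code checks the number sum rules (9.66) and (9.67) numerically and evaluates the quark momentum fraction, i.e. the sum over quarks and anti-quarks of $\int dx\, x f_i(x)$.

```python
import math


def beta(a, b):
    """Euler beta function B(a, b) = integral of x^(a-1) (1-x)^(b-1) over (0, 1)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def make_valence(n_quarks, a, b):
    """Toy valence density N x^(a-1) (1-x)^(b-1), normalized to n_quarks."""
    norm = n_quarks / beta(a, b)
    return lambda x: norm * x**(a - 1.0) * (1.0 - x)**(b - 1.0)


u_v = make_valence(2, 0.5, 4.0)    # hypothetical shape parameters
d_v = make_valence(1, 0.5, 5.0)


def q_s(x):
    """Toy flavour-independent sea, steeply rising at small x."""
    return 0.2 * (1.0 - x)**7 / x


def u(x): return u_v(x) + q_s(x)       # (9.68)
def d(x): return d_v(x) + q_s(x)       # (9.69)
ubar = dbar = s = sbar = q_s           # (9.70)


def integrate(g, n=4000):
    """Midpoint rule on (0, 1) after substituting x = t^2, which removes the
    integrable x^(-1/2) behaviour of the toy valence densities at x -> 0."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        total += g(t * t) * 2.0 * t
    return total * h


n_u = integrate(lambda x: u(x) - ubar(x))      # should be 2, cf. (9.66)
n_d = integrate(lambda x: d(x) - dbar(x))      # should be 1, cf. (9.67)
quark_momentum = integrate(
    lambda x: x * (u(x) + ubar(x) + d(x) + dbar(x) + s(x) + sbar(x)))


def F2_ep(x):
    """Proton scaling function (9.58) for u, d, s flavours."""
    return x * (4 / 9 * (u(x) + ubar(x)) + 1 / 9 * (d(x) + dbar(x) + s(x) + sbar(x)))


def F2_en(x):
    """Neutron scaling function (9.62): u and d exchanged relative to the proton."""
    return x * (4 / 9 * (d(x) + dbar(x)) + 1 / 9 * (u(x) + ubar(x) + s(x) + sbar(x)))


print(n_u, n_d, quark_momentum, F2_en(0.3) / F2_ep(0.3))
```

With these made-up parameters the quarks and anti-quarks carry a little under half of the proton momentum; the remainder has to be attributed to something else in the proton.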


One more important sum rule emerges from the picture of $xf_i(x)$ as the fractional
momentum carried by quark $i$. This is the momentum sum rule

$$ \int_0^1 dx\, x[u(x) + \bar{u}(x) + d(x) + \bar{d}(x) + s(x) + \bar{s}(x)] = 1 - \epsilon \qquad (9.73) $$

where $\epsilon$ is interpreted as the fraction of the proton momentum that is not carried by quarks
and antiquarks. The integral in (9.73) is directly related to $\nu$ and $\bar{\nu}$ cross sections, and its
evaluation implies $\epsilon \simeq \frac{1}{2}$ (the CHARM (1981) result was $1 - \epsilon = 0.44 \pm 0.02$). This suggests
that about half the total momentum is carried by uncharged objects. These remaining
partons are identified with the gluons of QCD. They have their own PDF, $g(x)$.

An enormous effort, both experimental and theoretical, has gone into determining the 
parton distribution functions. The subject is regularly reviewed by the Particle Data Group 
(currently Workman et al. 2022). Figure 9.9 shows the result of one analysis. In this much
more sophisticated approach, which includes higher order QCD corrections, it is necessary
to specify a particular value of $Q^2$ (here denoted by $Q^2 = \mu^2$) at which the distributions
are defined, as explained in chapter 15 of volume 2. The distributions at this value are
quantities to be determined from experiment. The distributions at other values of $Q^2$ are
then predicted by perturbative QCD.

The main features of the PDFs shown in figure 9.9 are: the valence quark distributions
are peaked at around $x = 0.2$, and go to zero for $x \to 0$ and $x \to 1$; the sea quarks, on
the other hand, have a high probability of carrying very low momentum fractions, as do
the gluons (in fact, the gluons dominate for $x$ below about 0.1). This is then the picture of
'what nucleons are made of', as revealed by some 40 years of research.

FIGURE 9.9

Distributions of $x$ times the unpolarized parton distribution functions $f(x)$ (where
$f = u_v, d_v, \bar{u}, \bar{d}, s, c, b, g$) and their associated uncertainties using the MSHT20NNLO
parametrization (Bailey et al. 2021) at a scale $\mu^2 = 10$ GeV² and $\mu^2 = 10{,}000$ GeV².
[Figure reproduced from the review of Structure Functions by E C Aschenauer, R S Thorne
and R Yoshida, section 18 in the Review of Particle Physics, R L Workman et al. (Particle
Data Group) Prog. Theor. Exp. Phys. 2022 083C01 (2022)]


9.4 The Drell-Yan process 


Much of the importance of the parton model lies outside its original domain of deep inelastic 
scattering. In deep inelastic scattering it is possible to provide a more formal basis for the 
parton model in terms of light-cone and short-distance operator expansions (see chapter 18 
of Peskin and Schroeder 1995). The advantage of the parton formulation lies in the fact that 
it suggests other processes for which a parton description may be relevant but for which 
formal operator arguments are not possible. One such example is the Drell-Yan process 
(Drell and Yan 1970) 

$$ p + p \to \mu^+\mu^- + X \qquad (9.74) $$

in which a $\mu^+\mu^-$ pair is produced in proton–proton collisions along with unobserved hadrons
X, as shown in figure 9.10. The assumption of the parton model is that in the limit

$$ s \to \infty \quad \text{with } \tau = q^2/s \text{ finite} \qquad (9.75) $$

the dominant process is that shown in figure 9.11: a quark and anti-quark from different
hadrons are assumed to annihilate to a virtual photon which then decays to a $\mu^+\mu^-$ pair
(compare figures 9.3 and 9.4), the remaining quarks and anti-quarks subsequently emerging
as hadrons.

Let us work in the CM system and neglect all masses. In this case we have

$$ p_1^\mu = (P, 0, 0, P), \qquad p_2^\mu = (P, 0, 0, -P) \qquad (9.76) $$

and
$$ s = 4P^2. \qquad (9.77) $$

FIGURE 9.10
Drell–Yan process.

Neglecting quark masses and transverse momenta, we have quark momenta

$$ p_{q_1}^\mu = x_1(P, 0, 0, P) \qquad (9.78) $$
$$ p_{q_2}^\mu = x_2(P, 0, 0, -P) \qquad (9.79) $$
and the photon momentum
$$ q = p_{q_1} + p_{q_2} \qquad (9.80) $$
has non-zero components
$$ q^0 = (x_1 + x_2)P \qquad (9.81) $$
$$ q^3 = (x_1 - x_2)P. \qquad (9.82) $$
Thus we find
$$ q^2 = 4x_1x_2P^2 \qquad (9.83) $$
and hence
$$ \tau = q^2/s = x_1x_2. \qquad (9.84) $$
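
The kinematic relations (9.78)-(9.84) can be packaged as a small helper. This is a sketch with hypothetical function names and sample values; the inverse map from $(\tau, x_F)$ back to $(x_1, x_2)$, with $x_F = x_1 - x_2$, follows from solving $x_1x_2 = \tau$ together with $x_1 - x_2 = x_F$ and is included as an illustration.

```python
import math


def drell_yan_kinematics(x1, x2, P):
    """Virtual-photon kinematics for massless, collinear partons carrying momentum
    fractions x1 and x2 of two beams with CM momentum P each."""
    q0 = (x1 + x2) * P            # (9.81)
    q3 = (x1 - x2) * P            # (9.82)
    q2 = q0**2 - q3**2            # invariant mass squared of the lepton pair
    s = 4.0 * P**2                # (9.77)
    return {"q0": q0, "q3": q3, "q2": q2, "tau": q2 / s, "xF": x1 - x2}


def fractions_from_tau_xF(tau, xF):
    """Invert tau = x1 x2 and xF = x1 - x2 for the positive root x1."""
    x1 = 0.5 * (xF + math.sqrt(xF**2 + 4.0 * tau))
    return x1, x1 - xF


# Hypothetical example: 13.7 GeV beam momentum in the CM frame.
P = 13.7
kin = drell_yan_kinematics(0.30, 0.12, P)
x1_back, x2_back = fractions_from_tau_xF(kin["tau"], kin["xF"])

print(kin["q2"], 4 * 0.30 * 0.12 * P**2)   # (9.83)
print(kin["tau"], 0.30 * 0.12)             # (9.84)
print(x1_back, x2_back)                    # recovers the input fractions
```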


The cross section for the basic process
$$ q\bar{q} \to \mu^+\mu^- \qquad (9.85) $$
is calculated using the result of problem 8.18. Since the QED process

$$ e^+e^- \to \mu^+\mu^- \qquad (9.86) $$

has the cross section (neglecting all masses)
$$ \sigma(e^+e^- \to \mu^+\mu^-) = 4\pi\alpha^2/3q^2 \qquad (9.87) $$
we expect the result for a quark of type $a$ with charge $e_a$ (in units of $e$) to be
$$ \sigma(q_a\bar{q}_a \to \mu^+\mu^-) = (4\pi\alpha^2/3q^2)\,e_a^2. \qquad (9.88) $$

FIGURE 9.11
Parton model amplitude for the Drell–Yan process.

To obtain the parton model prediction for proton–proton collisions, one merely multiplies
this cross section by the probabilities for finding a quark of type $a$ with momentum fraction
$x_1$, and an anti-quark of the same type with fraction $x_2$, namely

$$ q_a(x_1)\,dx_1\;\bar{q}_a(x_2)\,dx_2. \qquad (9.89) $$

There is, of course, another contribution for which the anti-quark has fraction $x_1$ and the
quark $x_2$:
$$ \bar{q}_a(x_1)\,dx_1\;q_a(x_2)\,dx_2. \qquad (9.90) $$


Thus the Drell–Yan prediction is

$$ d^2\sigma(pp \to \mu^+\mu^- + X) = \frac{4\pi\alpha^2}{9q^2}\sum_a e_a^2\,[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)]\,dx_1\,dx_2 \qquad (9.91) $$

where we have included a factor $\frac{1}{3}$ to account for the colour of the quarks: in order to
make a colour singlet photon, one needs to match the colours of quark and anti-quark.
Equation (9.91) is the master formula. Its importance lies in the fact that the same quark
distribution functions are measured in deep inelastic lepton scattering so one can make
absolute predictions³. For example, if the photon in figure 9.11 is replaced by a W(Z), one
can predict W(Z) production cross sections, as we shall see in volume 2.

We would expect some 'scaling' property to hold for this cross section, following from
the point-like constituent cross section (9.88). One way to exhibit this is to use the variables
$q^2$ and $x_F = x_1 - x_2$ as discussed in problem 9.6. There it is shown that the dimensionless
quantity

$$ q^4\,\frac{d^2\sigma}{dq^2\,dx_F} \qquad (9.92) $$
should be a function of $x_F$ and the ratio $\tau = q^2/s$. The data bear out this prediction
well (see figure 9.12).

Furthermore, the assumption that the lepton pair is produced via quark–anti-quark
annihilation to a virtual photon can be checked by observing the angular distribution of
either lepton in the dilepton rest frame, relative to the incident proton beam direction. This
distribution is expected to be the same as in $e^+e^- \to \mu^+\mu^-$, namely (cf (8.194))

$$ d\sigma/d\Omega \propto (1 + \cos^2\theta) \qquad (9.93) $$

as is indeed observed (figure 9.13). Note that figure 9.13 provides evidence that the quarks
have spin-$\frac{1}{2}$: if they are assumed to have spin-0, the angular distribution would be (see
problem 9.7) proportional to $(1 - \cos^2\theta)$, and this is clearly ruled out.

³QCD corrections make the connection more complicated, but still perturbatively computable.

FIGURE 9.12
The dimensionless cross section $M^3\,d^2\sigma/dM\,dx_F$ ($M = \sqrt{q^2}$) at $x_F = 0$ for pN scattering,
plotted against $\sqrt{\tau} = M/\sqrt{s}$ (Scott 1985), for $\sqrt{s}$ = 62, 44, 27.4 and 23.8 GeV.

FIGURE 9.13

Angular distribution of muons, measured in the $\mu^+\mu^-$ rest frame, relative to the incident
beam direction, in the Drell–Yan process. (Figure from D H Perkins Introduction to High
Energy Physics 3rd edn, copyright 1987; reprinted by permission of Pearson Education,
Inc., Upper Saddle River, NJ.)

FIGURE 9.14
$e^+e^-$ annihilation to hadrons in one-photon approximation.


9.5 $e^+e^-$ annihilation into hadrons


The last electromagnetic process we wish to consider is electron–positron annihilation into
hadrons (figure 9.14):

$$ e^+e^- \to X. \qquad (9.94) $$

As usual, the dominance of the one-photon intermediate state is assumed. Figure 9.14 is
clearly a generalization of figure 8.9, the latter describing the particular case in which the
final hadronic state is $\pi^+\pi^-$. As a preliminary to discussing (9.94), let us therefore revisit
$e^+e^- \to \pi^+\pi^-$ first.

The $O(e^2)$ amplitude is given in equation (8.159). We shall simplify the calculation by
neglecting both the electron and the pion masses. The spinor part of the amplitude is then
$-2\bar{v}(k_1)\,\not{p}_1\,u(k)$, and the 'L · T' product is $16(k\cdot p_1)(k_1\cdot p_1)$. Borrowing the general CM cross
section formula (6.129) from chapter 6 as in (8.121), and including the pion form factor, we
obtain for the unpolarized CM differential cross section

$$ \frac{d\bar{\sigma}}{d\Omega} = \frac{F^2(q^2)\,\alpha^2}{4q^2}\,(1 - \cos^2\theta) \qquad (9.95) $$

and the total unpolarized cross section is

$$ \bar{\sigma} = \frac{2\pi\alpha^2}{3q^2}\,F^2(q^2). \qquad (9.96) $$
The cross section $\bar{\sigma}$ contains a $1/q^2$ factor, just like that for $e^+e^- \to \mu^+\mu^-$ as in (9.87), but
this 'point-like' behaviour is modified by the square of the form factor, evaluated at time-like
$q^2$. When the measured $\bar{\sigma}$ is plotted against $q^2$ for $q^2 \lesssim 1$ (GeV)², a pronounced resonance is
seen at $q^2 \approx m_\rho^2$, superimposed on the smooth $1/q^2$ background, where $m_\rho$ is the mass of the
rho resonance ($J^P = 1^-$ $q\bar{q}$ state). The interpretation of this is shown in figure 9.15. $F(q^2)$
should therefore be parametrized as a resonance, as in (6.107), or a more sophisticated
version to take account of the fact that the $\pi$'s are emitted in an $\ell = 1$ state. Just as $F^2(q^2)$
modified the point-like cross section in the space-like region for $e^-\pi^+ \to e^-\pi^+$, so here it
modifies the point-like ($\sim 1/q^2$) behaviour in the time-like region.

Returning now to the process (9.94), the cross section for it is shown as a function of
CM energy $(q^2)^{1/2}$ in figure 9.16. The general point-like fall-off as $1/q^2$ is seen, with peaks
due to a succession of boson resonances superimposed ($\rho$, $J/\psi$, $\Upsilon$, $Z^0$, ...). The $1/q^2$ fall-off
is suggestive of a (point-like) parton picture and indeed the process (9.94) is similar to the
Drell–Yan one:

$$ pp \to \mu^+\mu^- + X. \qquad (9.97) $$

FIGURE 9.15
$\rho$-dominance of the pion electromagnetic form factor in the time-like ($q^2 > 0$) region.

It is natural to imagine that at large $q^2$ the basic subprocess is quark–anti-quark pair
creation (figure 9.17). The total cross section for $q\bar{q}$ pair production is then (cf (9.88))

$$ \sigma(e^+e^- \to q_a\bar{q}_a) = (4\pi\alpha^2/3q^2)\,e_a^2. \qquad (9.98) $$
In the vicinity of mesonic resonances such as the $\rho$, we can infer that the dominant component
in the final state is that in which the $q\bar{q}$ pair is strongly bound into a mesonic state,
which then decays into hadrons. Away from resonances, and increasingly at larger values of
$q^2$, the produced $q\bar{q}$ pair seek to separate from the interaction region. As they draw apart,
however, the interaction between them increases (recall section 1.3.6), producing more $q\bar{q}$
pairs, together with radiated gluons. In this process, the coloured quarks and gluons eventually
must form colourless hadrons, since we know that no coloured particles have been
observed ('confinement of colour'). If one assumes that the presumed colour confinement
mechanism does not affect the prediction (9.98), then we arrive at the result

$$ \sigma(e^+e^- \to \text{hadrons}) = (4\pi\alpha^2/3q^2)\sum_a e_a^2 \qquad (9.99) $$
at large $q^2$, where '$a$' includes all flavours produced at that energy.

FIGURE 9.16

The cross section $\sigma$ for the annihilation process $e^+e^- \to$ hadrons, and the ratio $R$ (see
equation (9.100)), as a function of CM energy. [Figure reproduced courtesy Michael Barnett, for
the Particle Data Group, from the Review of Particle Physics, K Nakamura et al. (Particle
Data Group) Journal of Physics G 37 (2010) 075021 IOP Publishing Limited.]

FIGURE 9.17
Parton model subprocess in $e^+e^- \to$ hadrons.


This model is best tested by taking out the dominant $1/q^2$ behaviour and plotting the
ratio

$$ R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \sum_a e_a^2. \qquad (9.100) $$
For the light quarks u, d and s occurring in three colours, we therefore predict
$$ R = 3\left[\left(\tfrac{2}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2\right] = 2. \qquad (9.101) $$

Above the c threshold but below the b threshold we expect $R = \frac{10}{3}$, and above the b
threshold $R = \frac{11}{3}$. These expectations are in reasonable accord with experiment, especially
at energies well beyond the resonance region and the b threshold, as figure 9.16 shows. In 
this figure the dotted curve is the prediction of the quark-parton model, equation (9.99). 
The solid curve includes perturbative QCD corrections, which we will return to in chapter 
15 of volume 2. 
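
The arithmetic behind these values of $R$ is short enough to spell out. The sketch below is illustrative: the dictionary and function names are assumptions, and it simply sums the squared quark charges over the flavours taken to be above threshold, with the colour factor 3 of (9.101), using exact rational arithmetic.

```python
from fractions import Fraction

# Quark charges in units of e (standard quark-model assignments).
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}
N_COLOURS = 3


def r_ratio(active_flavours):
    """Parton-model R: colour factor times the sum of squared charges of the
    flavours that can be pair-produced at the energy considered."""
    return N_COLOURS * sum(QUARK_CHARGES[f] ** 2 for f in active_flavours)


print(r_ratio("uds"))     # below the c threshold
print(r_ratio("udsc"))    # between the c and b thresholds
print(r_ratio("udscb"))   # above the b threshold
```

Each new flavour adds a step of $3e_a^2$ to $R$: $\frac{4}{3}$ for charm and $\frac{1}{3}$ for bottom.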

The success of this prediction leads one to consider more detailed consequences of the
picture. For example, the angular distribution of massless spin-$\frac{1}{2}$ quarks is expected to be
(cf (8.194) again)

$$ d\sigma/d\Omega = (\alpha^2/4q^2)\,e_a^2\,(1 + \cos^2\theta) \qquad (9.102) $$

just as for the $\mu^+\mu^-$ process. However, in this case there is an important difference: the
quarks are not observed! Nevertheless a remarkable 'memory' of (9.102) is retained by the
observed final-state hadrons. Experimentally one observes events in which hadrons emerge
from the interaction region in two relatively well-collimated cones or 'jets' (see figure 9.18).
The distribution of events as a function of the (inferred) angle of the jet axis is shown in
figure 9.19 and is in good agreement with (9.102). The interpretation is that the primary
process is $e^+e^- \to q\bar{q}$, the quark and the anti-quark then turning into hadrons as they
separate and experience the very strong colour forces, but without losing the memory of
the original quark angular distribution.

FIGURE 9.18
Two-jet event in $e^+e^-$ annihilation from the TASSO detector at the $e^+e^-$ storage ring
PETRA.

FIGURE 9.19

Angular distribution of jets in two-jet events, measured in the two-jet rest frame, relative to
the incident beam direction, in the process $e^+e^- \to$ two jets (Althoff et al. 1984). The full
curve is the $(1 + \cos^2\theta)$ distribution. Since it is not possible to say which jet corresponded
to the quark and which to the anti-quark, only half the angular distribution can be plotted.
The asymmetry visible in figure 8.20(b) is therefore not apparent.


Problems 


9.1 The various normalization factors in equations (9.3) and (9.11) may be checked in the
following way. The cross section for inclusive electron–proton scattering may be written
(equation (9.11)):

$$ d\sigma = \left(\frac{4\pi\alpha}{q^2}\right)^2 \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\; 4\pi M\, L_{\mu\nu}W^{\mu\nu}\, \frac{d^3k'}{2\omega'(2\pi)^3} \qquad (9.103) $$

in the usual one-photon exchange approximation, and the tensor $W^{\mu\nu}$ is related to hadronic
matrix elements of the electromagnetic current operator by equation (9.3):

$$ e^2W^{\mu\nu}(q,p) = \frac{1}{4\pi M}\,\frac{1}{2}\sum_s\sum_X \langle p; p, s|\hat{j}^\mu_{\rm em}(0)|X; p'\rangle\,\langle X; p'|\hat{j}^\nu_{\rm em}(0)|p; p, s\rangle\,(2\pi)^4\delta^4(p + q - p') $$

where the sum X is over all possible hadronic final states. If we consider the special case of
elastic scattering, the sum over X is only over the final proton's degrees of freedom:

$$ e^2W^{\mu\nu}_{\rm el} = \frac{1}{4\pi M}\,\frac{1}{2}\sum_{s,s'} \langle p; p, s|\hat{j}^\mu_{\rm em}(0)|p; p', s'\rangle\,\langle p; p', s'|\hat{j}^\nu_{\rm em}(0)|p; p, s\rangle\,(2\pi)^4\delta^4(p + q - p')\,\frac{1}{(2\pi)^3}\,\frac{d^3p'}{2E'} $$

Now use equation (8.208) with $F_1 = 1$ and $\kappa = 0$ (i.e. the electromagnetic current matrix
element for a 'point' proton) to show that the resulting cross section is identical to that for
elastic e$\mu$ scattering.

9.2

(a) Perform the contraction $L_{\mu\nu}W^{\mu\nu}$ for inclusive inelastic electron-proton scattering
(remember $q^{\mu}L_{\mu\nu} = q^{\nu}L_{\mu\nu} = 0$). Hence verify that the inclusive differential cross
section in terms of ‘laboratory’ variables, and neglecting the electron mass, has
the form

\[
\frac{d^2\sigma}{d\Omega\,dk'} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)].
\]

(b) By calculating the Jacobian

\[
J = \begin{vmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{vmatrix}
\]

for a change of variables $(x, y) \to (u, v)$,

\[
du\,dv = |J|\,dx\,dy,
\]

find expressions for $d^2\sigma/dQ^2\,d\nu$ and $d^2\sigma/dx\,dy$, where $Q^2$ and $\nu$ have their usual
significance, and $x$ is the scaling variable $Q^2/2M\nu$ and $y = \nu/k$.


9.3 Consider the description of inelastic electron-proton scattering in terms of virtual photon
cross sections:

(a) In the ‘laboratory’ frame with
\[
p^{\mu} = (M, 0, 0, 0) \quad\text{and}\quad q^{\mu} = (q^0, 0, 0, q^3)
\]
evaluate the transverse spin sum

Hence show that the ‘Hand’ cross section for transverse virtual photons is
\[
\sigma_T = (4\pi^2\alpha/K)\,W_1.
\]

(b) Using the definition
\[
\epsilon_S^{\mu} = (1/\sqrt{Q^2})\,(q^3, 0, 0, q^0)
\]
and rewriting this in terms of the ‘laboratory’ 4-vectors $p^{\mu}$ and $q^{\mu}$, evaluate the
longitudinal/scalar virtual photon cross section. Hence show that
\[
W_2 = \frac{K}{4\pi^2\alpha}\,\frac{Q^2}{Q^2 + \nu^2}\,(\sigma_S + \sigma_T).
\]


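9.4 In this problem, we consider the representation of the 4 x 4 Dirac matrices in which
(see (3.40))
\[
\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad
\beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
Define also the 4 x 4 matrix $\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and the Dirac four-component spinor
$u = \begin{pmatrix} \phi \\ \chi \end{pmatrix}$. Then the two-component spinors $\phi$, $\chi$ satisfy
\[
\begin{aligned}
\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi &= E\phi - m\chi \\
\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi &= -E\chi + m\phi.
\end{aligned}
\]

(a) Show that for a massless Dirac particle, $\phi$ and $\chi$ become helicity eigenstates (see
section 3.3) with positive and negative helicity respectively.

(b) Defining
\[
P_R = \frac{1 + \gamma_5}{2}, \qquad P_L = \frac{1 - \gamma_5}{2},
\]
show that $P_R^2 = P_R$, $P_L^2 = P_L$, $P_R P_L = 0 = P_L P_R$, and that $P_R + P_L = 1$. Show also
that
\[
P_R \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} \phi \\ 0 \end{pmatrix}, \qquad
P_L \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix},
\]
and hence that $P_R$ and $P_L$ are projection operators for massless Dirac particles,
onto states of definite helicity. Discuss what happens when $m \neq 0$.

(c) The general massless spinor u can be written
\[
u = (P_L + P_R)u \equiv u_L + u_R
\]
where $u_L$, $u_R$ have the indicated helicities. Show that
\[
\bar{u}\gamma^{\mu}u = \bar{u}_L\gamma^{\mu}u_L + \bar{u}_R\gamma^{\mu}u_R
\]
where $\bar{u}_L = u_L^{\dagger}\gamma^0$, $\bar{u}_R = u_R^{\dagger}\gamma^0$; and deduce that in electromagnetic interactions
of massless fermions helicity is conserved.

(d) In weak interactions an axial vector current $\bar{u}\gamma^{\mu}\gamma_5 u$ also enters. Is helicity still
conserved?

(e) Show that the ‘Dirac’ mass term $m\bar{\psi}\psi$ may be written as $m(\bar{\psi}_L\psi_R + \bar{\psi}_R\psi_L)$.

9.5 In the HERA colliding beam machine, positrons of total energy 27.5 GeV collide head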
on with protons of total energy 820 GeV. Neglecting both the positron and the proton rest 
masses, calculate the centre-of-mass energy in such a collision process. 

Some theories have predicted the existence of ‘leptoquarks’, which could be produced at 
HERA as a resonance state formed from the incident positron and the struck quark. How 
would a distribution of such events look, if plotted versus the variable x? 


9.6

(a) By the expedient of inserting a $\delta$-function, the differential cross section for Drell-Yan
production of a lepton pair of mass $\sqrt{q^2}$ may be written as

\[
\frac{d\sigma}{dq^2} = \int dx_1\,dx_2\,\frac{d^2\sigma}{dx_1\,dx_2}\,\delta(q^2 - s x_1 x_2).
\]

Show that this is equivalent to the form

\[
\frac{d\sigma}{dq^2} = \frac{4\pi\alpha^2}{9q^4}\int dx_1\,dx_2\; x_1 x_2\,\delta(x_1 x_2 - \tau)
\times \sum_a e_a^2\,[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)]
\]

which, since $q^2 = s\tau$, exhibits a scaling law of the form
\[
s^2\,d\sigma/dq^2 = F(\tau).
\]

(b) Introduce the Feynman scaling variable
\[
x_F = x_1 - x_2
\]
with
\[
q^2 = s x_1 x_2
\]
and show that
\[
dq^2\,dx_F = (x_1 + x_2)\,s\,dx_1\,dx_2.
\]
Hence show that the Drell-Yan formula can be rewritten as
\[
\frac{d^2\sigma}{dq^2\,dx_F} = \frac{4\pi\alpha^2}{9q^4}\,\frac{\tau}{(x_F^2 + 4\tau)^{1/2}}\,\sum_a e_a^2\,[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)].
\]

9.7 Verify that if the quarks participating in the Drell-Yan subprocess $q\bar{q} \to \gamma \to \mu^+\mu^-$
had spin-0, the CM angular distribution of the final $\mu^+\mu^-$ pair would be proportional to
$(1 - \cos^2\theta)$.

Loops and Renormalization I: The ABC Theory


We have seen how Feynman diagrams represent terms in a perturbation theory expansion 
of physical amplitudes, namely the Dyson expansion of section 6.2. Terms of a given order 
all involve the same power of a ‘coupling constant’, which is the multiplicative constant 
appearing in the interaction Hamiltonian—for example, ‘g’ in the ABC theory, or the charge 
‘e’ in electrodynamics. In practice, it often turns out that the relevant parameter is actually 
the square of the coupling constant, and factors of $4\pi$ have a habit of appearing on a regular
basis; so, for QED, the perturbation series is conveniently ordered according to powers of
the fine structure constant $\alpha = e^2/4\pi \approx 1/137$.

Equivalently, this is an expansion in terms of the number of vertices appearing in the 
diagrams, since one power of the coupling constant is associated with each vertex. For a 
given physical process, the lowest-order diagrams (the ones with the fewest vertices) are 
those in which each vertex is connected to every other vertex by just one internal line; 
these are called tree diagrams. The Yukawa (u-channel) exchange process of figure 6.4, 
and the s-channel process of figure 6.5, are both examples of tree diagrams, and indeed 
all of our calculations so far have not gone further than this lowest-order (‘tree’) level. 
Admittedly, since $\alpha$ is after all pretty small, tree diagrams in QED are likely to give us a
good approximation to compare with experiment. Nevertheless, a long history of beautiful 
and ingenious experiments has resulted in observables in QED being determined to an 
accuracy far better than the O(1%) represented by the leading (tree) terms. More generally, 
precision experiments at LEP, the LHC, and other laboratories have an accuracy sensitive 
to higher-order corrections in the Standard Model (SM). Hence, some understanding of the 
physics beyond the tree approximation is now essential for phenomenology. 

All higher-order processes beyond the tree approximation involve loops, a concept easier 
to recognize visually than to define in words. In section 6.3.5 we have already seen (figure 6.8)
one example of an $O(g^4)$ correction to the $O(g^2)$ C-exchange tree diagram of figure 6.4, which
contains one loop. The crucial point is that whereas a tree diagram can be cut into two 
separate pieces by severing just one internal line, to cut a loop diagram into two separate 
pieces requires the severing of at least two internal lines. 

In these last two chapters, we aim to provide an introduction to higher-order processes, 
confining ourselves to ‘one-loop’ order. In the present chapter we shall concentrate mainly on 
the particular loop appearing in figure 6.8. This will lead us into the physics of renormalization
for the ABC theory, which (as a Yukawa-like theory) is a good theoretical laboratory
for studying ‘one-loop physics’, without the complications of spinor and gauge fields. In the 
following chapter, we shall discuss one-loop diagrams in QED, emphasizing some important 
physical consequences, such as corrections to Coulomb’s law, anomalous magnetic moments 
and the running coupling constant.

FIGURE 10.1
$O(g^4)$ contribution to the process $A + B \to A + B$, involving the modification of the C
propagator by the insertion of a loop.


10.1 The propagator correction in ABC theory 
10.1.1 The $O(g^2)$ self-energy $\Pi_C^{[2]}(q^2)$


We consider figure 6.8, reproduced here again as figure 10.1. In section 6.3.5, we gave the 
extra rule (‘(iii)’) needed to write down the invariant amplitude for this process. We first 
show how this rule arises in the special case of figure 10.1. 

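Clearly, figure 10.1 is a fourth-order process, so it must emerge from the term

\begin{equation}
\begin{aligned}
\frac{(-ig)^4}{4!}\iiiint d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; & \langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B) \\
& \times T\{\hat\phi_A(x_1)\hat\phi_B(x_1)\hat\phi_C(x_1)\cdots\hat\phi_A(x_4)\hat\phi_B(x_4)\hat\phi_C(x_4)\} \\
& \times \hat{a}^{\dagger}_A(p_A)\hat{a}^{\dagger}_B(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}
\end{aligned}
\tag{10.1}
\end{equation}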


of the Dyson expansion. Since it is basically a u-channel exchange process ($u = (p_A - p'_B)^2 =
(p'_A - p_B)^2$), the vev’s involving the external creation and annihilation operators must appear
as they do in equation (6.89) (‘ingoing A, outgoing $B'$ at one point $x_2$; ingoing B, outgoing
$A'$ at another point $x_1$’) rather than as in equation (6.88) (‘ingoing A and B at $x_2$; outgoing
$A'$ and $B'$ at $x_1$’). In (10.1), however, we unfortunately have four space-time points to
choose from, rather than merely the two in (6.74). Figuring out exactly which choices are
in fact equivalent and which are not is best left to private struggle, especially since we are
not seriously interested in the numerical value of our fourth-order corrections in this case.
Let us simply consider one choice, analogous to (6.89). This yields the amplitude (cf (6.91))


\begin{equation}
\begin{aligned}
(-ig)^4\iiiint d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; & e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2} \\
& \times \langle 0|T\{\hat\phi_C(x_1)\hat\phi_C(x_2)\hat\phi_A(x_3)\hat\phi_B(x_3)\hat\phi_C(x_3)\hat\phi_A(x_4)\hat\phi_B(x_4)\hat\phi_C(x_4)\}|0\rangle
\end{aligned}
\tag{10.2}
\end{equation}

and we have discarded the numerical factor 1/4!. Once again, there are many terms in the
expansion of the vev of the eight operators in (10.2). But, with an eye on the structure of
the Feynman amplitude at which we are aiming (figure 10.1), let us consider again just a
single contribution

\begin{equation}
\begin{aligned}
(-ig)^4\iiiint d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\; & e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2} \\
& \times \langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_3))|0\rangle\,\langle 0|T(\hat\phi_C(x_2)\hat\phi_C(x_4))|0\rangle \\
& \times \langle 0|T(\hat\phi_A(x_3)\hat\phi_A(x_4))|0\rangle\,\langle 0|T(\hat\phi_B(x_3)\hat\phi_B(x_4))|0\rangle
\end{aligned}
\tag{10.3}
\end{equation}

which contains four propagators connected as in figure 10.2.

FIGURE 10.2
The space-time structure of the integrand in (10.3).

As we saw in section 6.3.2, each of these propagators is a function only of the difference of
the two space-time points involved. Introducing relative coordinates $x = x_1 - x_3$, $y = x_2 - x_4$,
$z = x_3 - x_4$, and the CM coordinate $X = \frac{1}{4}(x_1 + x_2 + x_3 + x_4)$, we find (problem 10.1) that
(10.3) becomes


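\begin{equation}
\begin{aligned}
(-ig)^4\iiiint d^4X\,d^4x\,d^4y\,d^4z\; & e^{i(p'_A + p'_B - p_A - p_B)\cdot X}\,e^{i(p'_A - p_B)\cdot(3x - y + 2z)/4} \\
& \times e^{i(p'_B - p_A)\cdot(-x + 3y - 2z)/4}\,D_C(x)\,D_C(y)\,D_A(z)\,D_B(z)
\end{aligned}
\tag{10.4}
\end{equation}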


where $D_i$ is the position-space propagator for type-i particles (i = A, B, C), defined as
in (6.98). The integral over X gives the expected overall 4-momentum conservation factor,
$(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)$. Setting $q = p_A - p'_B = p'_A - p_B$ (where 4-momentum conservation
has been used), (10.4) becomes

\begin{equation}
\begin{aligned}
(-ig)^4(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\iiint d^4x\,d^4y\,d^4z\; & e^{iq\cdot x}\,D_C(x) \\
& \times e^{-iq\cdot y}\,D_C(y)\,e^{iq\cdot z}\,D_A(z)\,D_B(z).
\end{aligned}
\tag{10.5}
\end{equation}


The integrals over x and y separate out completely, each being just the Fourier transform
of a C propagator, that is, the momentum-space propagator $\tilde{D}_C(q)$. Since the latter is a
function of $q^2$ only, we end up with two factors of $i/(q^2 - m_C^2 + i\epsilon)$, corresponding to the
two C propagators in the momentum-space Feynman diagram of figure 10.1. Note that
the Mandelstam u-variable is defined by $u = (p_A - p'_B)^2$ and is thus equal to $q^2$; we shall,
however, continue to use $q^2$ rather than u in what follows.

The remaining factor represents the loop. Including $(-ig)^2$ for the two vertices in the
loop, it is given by

\begin{equation}
(-ig)^2\int d^4z\; e^{iq\cdot z}\,D_A(z)\,D_B(z) \tag{10.6}
\end{equation}
which is the main result of our calculation so far. Since we want to end up finally with 


a momentum-space amplitude, let us introduce the A and B propagators in momentum 
space, and write (10.6) as (cf (6.99)) 


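\begin{align}
& (-ig)^2\int d^4z\; e^{iq\cdot z}\int\frac{d^4k_1}{(2\pi)^4}\,e^{-ik_1\cdot z}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\int\frac{d^4k_2}{(2\pi)^4}\,e^{-ik_2\cdot z}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon} \notag \\
&\quad = (-ig)^2\iint\frac{d^4k_1}{(2\pi)^4}\,\frac{d^4k_2}{(2\pi)^4}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon}\,(2\pi)^4\delta^4(k_1 + k_2 - q) \notag \\
&\quad = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q - k)^2 - m_B^2 + i\epsilon} \tag{10.7} \\
&\quad \equiv -i\Pi_C^{[2]}(q^2), \tag{10.8}
\end{align}

where we have defined the function $-i\Pi_C^{[2]}(q^2)$ as the loop (or ‘bubble’) amplitude appearing
in figure 10.1. It is a function of $q^2$ as follows from Lorentz invariance. The [2] refers to the
two powers of g, as will be explained shortly, after (10.15).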

Careful consideration of the equivalences among the various contractions shows that the 
amplitude corresponding to figure 10.1 is, in fact, just the simple expression 
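\begin{equation}
(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon}\,\bigl(-i\Pi_C^{[2]}(q^2)\bigr)\,\frac{i}{q^2 - m_C^2 + i\epsilon} \tag{10.9}
\end{equation}

where $-i\Pi_C^{[2]}(q^2)$ is given in (10.8). We see that whereas the ‘single-particle’ pieces, involving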
one C-exchange, do not involve any integral in momentum-space, the loop (which involves 
both A and B particles) does involve a momentum integral. This can be simply understood 
in terms of 4-momentum conservation, which holds at every vertex of a Feynman graph. 
At the top (or bottom) vertex of figure 10.1, the 4-momentum q of the C-particle is fully 
determined by that of the incoming and outgoing particles ($q = p_A - p'_B = p'_A - p_B$). This
same 4-momentum q flows in (and out) of the loop in figure 10.1, but nothing determines 
how it is to be shared between the A- and B-particles; all that can be said is that if the 
4-momentum of A is k (as in (10.7)) then that of B is q — k, so that their sum is q. The 
‘free’ variable k then has to be integrated over, and this is the physical origin of rule (iii) 
of section 6.3.5. 

We have devoted some time to the steps leading to expression (10.7), not only in order 
to follow the emergence of rule (iii) mathematically, but so as to lend some plausibility 
to a very important statement: the Feynman rules for associating factors with vertices 
and propagators, which we learned for tree graphs in chapters 6 and 8, also work, with 
the addition of rule (iii), for all more complicated graphs as well! Having seen most of 
just one fairly short calculation of a higher-order amplitude, the reader may perhaps now 
begin to appreciate just how powerful is the precise correspondence between ‘diagrams and 
amplitudes’, given by the Feynman rules. 

Having arrived at the expression for our first one-loop graph, we must at once draw 
the reader’s attention to the bad news: the integral in (10.7) is divergent at large values
of k. We shall postpone a more detailed mathematical analysis until section 10.3.1, but
the divergence can be plausibly inferred just from a simple counting of powers; there are
four powers of k in the numerator and four in the denominator, and the likelihood is
that the integral diverges as $\int^{\Lambda} k^3\,dk/k^4 \sim \ln\Lambda$, as $\Lambda \to \infty$. This is plainly a disaster: a
quantity which was supposed to be a small correction in perturbation theory is actually 
infinite! Such divergences, occurring as loop momenta go to infinity, are called ‘ultraviolet 
divergences’, and they are ubiquitous in quantum field theory. Only after a long struggle 
with these infinities was it understood how to obtain physically sensible results from such 
perturbation expansions. Depending on the type of field theory involved, the infinities can 
often be ‘tamed’ through a procedure known as renormalization, to which we shall provide 
an introduction in this and the following chapter. 
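
The power-counting estimate above can be checked numerically. The following Python sketch is an illustration only, not the evaluation of (10.7) itself: it replaces the loop integral by a toy radial integral $\int_0^{\Lambda} k^3\,dk/(k^2 + m^2)^2$, in which the mass `mass` is an assumed regulator at small k, and angular factors and the Minkowski signature are simply ignored. The function names are ours. The closed form is compared with a direct midpoint-rule sum, and the growth with the cut-off is seen to be logarithmic.

```python
import math


def toy_radial_integral(cutoff, mass=1.0):
    """Closed form of int_0^cutoff k^3 dk / (k^2 + mass^2)^2.

    Toy stand-in for the large-k behaviour of (10.7); 'mass' is an
    assumed regulator at small k, not a parameter taken from the text.
    """
    u = cutoff**2 + mass**2
    return 0.5 * (math.log(u / mass**2) + mass**2 / u - 1.0)


def toy_radial_integral_numeric(cutoff, mass=1.0, steps=20000):
    """Midpoint-rule estimate of the same integral, substituting k = exp(t) - 1."""
    t_max = math.log(1.0 + cutoff)
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        k = math.exp(t) - 1.0
        # dk = exp(t) dt is the Jacobian of the substitution
        total += k**3 / (k**2 + mass**2) ** 2 * math.exp(t) * h
    return total


if __name__ == "__main__":
    # Each factor of 10 in the cut-off adds roughly ln(10) to the integral.
    for cutoff in (1e1, 1e2, 1e3, 1e4):
        print(cutoff, toy_radial_integral(cutoff), math.log(cutoff))
```

In this toy model each tenfold increase of the cut-off adds approximately $\ln 10$ to the result, which is the slow but unbounded growth that the counting of powers suggests.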

The physical ideas behind renormalization are, however, just as relevant in cases—such 
as condensed matter physics—where the analogous higher-order (loop) corrections are not 
infinite, though possibly large. In quantum mechanics, infinite momentum corresponds to 
zero distance, and our fields are certainly ‘point-like’. But in condensed matter physics there 
is generally a natural non-zero smallest distance—the lattice size, or an atomic diameter, for 
example. In quantum field theory, such a ‘shortest distance’ would correspond to a ‘highest 
momentum’, meaning that the magnitudes of loop momenta would run from zero up to some 
finite limit $\Lambda$, say, rather than infinity. Such a $\Lambda$ is called a (momentum) ‘cut-off’. With
such a cut-off in place, our loop integrals are of course finite—but it would seem that we 
have then maltreated our field theory in some way. However, we might well ask whether we 
seriously believe that any of our quantum field theories is literally valid for arbitrarily high 
energies (or arbitrarily small distances). The answer is surely no. We are virtually certain 
that ‘new physics’ will come into play at some stage, which is not contained in—say—the 
QED, or even the SM, Lagrangian. At what scale this new physics will enter (the Planck 
energy? 10 TeV?) we do not know, but surely the current models will break down at some 
point. We should not be too alarmed, therefore, by formal divergences as $\Lambda \to \infty$. Rather,
it may be sensible to regard a cut-off $\Lambda$ as standing for some ‘new physics’ scale, accepting
some such manoeuvre as physically realistic as well as mathematically prudent. 

At the same time, however, we would not want our physical predictions, made using 
quantum field theories, to depend sensitively on $\Lambda$, i.e. on the unknown short-distance
physics, in this interpretation. Indeed, theories exist (for example, those in the SM and the
ABC theory) which can be reformulated in such a way that all dependence on $\Lambda$ disappears,
as $\Lambda \to \infty$; these are, precisely, renormalizable quantum field theories. Roughly speaking,
a renormalizable quantum field theory is one such that, when formulae are expressed in
terms of certain ‘physical’ parameters taken from experiment, rather than in terms of the
original parameters appearing in the Lagrangian, calculated quantities will be finite and
independent of $\Lambda$ as $\Lambda \to \infty$.

Solid state physics provides a close analogy. There, the usefulness of a description of, say, 
electrons in a metal in terms of their ‘effective charge’ and ‘effective mass’, rather than their 
free-space values, is well established. In this analogy, the free-space quantities correspond 
to our Lagrangian values, while the effective parameters correspond to our ‘physical’ ones. 
In both cases, the interactions are causing changes to the parameters. 

It is clear that we need to understand more precisely just what our ‘physical parameters’ 
might be and how they might be defined. This is what we aim to do in the remainder of the 
present section, and in the next one, before returning in section 10.3 to the mathematical 
details associated with evaluating (10.7), and indicating how renormalization works for the 
self-energy. Having thus prepared the ground, we shall introduce a more powerful approach 
in section 10.4, and offer a few preliminary remarks about ‘renormalizability’ in section 10.5, 
returning to that topic at the end of the following chapter. Although usually not explicitly
indicated, loop corrections considered in this and the following section will be understood
to be defined with a cut-off $\Lambda$, so that they are finite.

FIGURE 10.3
$O(g^6)$ term in $A + B \to A + B$, involving the insertion of two loops in the C propagator.

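To begin the discussion of the physical significance of our $O(g^4)$ correction, (10.9), it
is convenient to consider both the $O(g^2)$ term (6.100) and the $O(g^4)$ correction together,
obtaining

\begin{equation}
(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)
\times\left\{\frac{i}{q^2 - m_C^2} + \frac{i}{q^2 - m_C^2}\,\bigl(-i\Pi_C^{[2]}(q^2)\bigr)\,\frac{i}{q^2 - m_C^2}\right\} \tag{10.10}
\end{equation}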


where the $i\epsilon$ in the C propagators does not need to be retained. Both the form of (10.10) and
inspection of figure 10.1 suggest that the $O(g^4)$ term we have calculated can be regarded
as an $O(g^2)$ correction to the propagator for the C-particle. Indeed, we can easily imagine
adding in the $O(g^6)$ term shown in figure 10.3, and in fact the whole infinite series of such
‘bubbles’ connected by simple C propagators. The infinite geometric series for the corrected
propagator shown in figure 10.4 has the form

\begin{align}
& \frac{i}{q^2 - m_C^2} + \frac{i}{q^2 - m_C^2}\,\bigl(-i\Pi_C^{[2]}(q^2)\bigr)\,\frac{i}{q^2 - m_C^2} \notag \\
&\qquad + \frac{i}{q^2 - m_C^2}\,\bigl(-i\Pi_C^{[2]}(q^2)\bigr)\,\frac{i}{q^2 - m_C^2}\,\bigl(-i\Pi_C^{[2]}(q^2)\bigr)\,\frac{i}{q^2 - m_C^2} + \cdots \tag{10.11} \\
&\quad = \frac{i}{q^2 - m_C^2}\,(1 + r + r^2 + \cdots) \tag{10.12}
\end{align}

where

\begin{equation}
r = \Pi_C^{[2]}(q^2)/(q^2 - m_C^2). \tag{10.13}
\end{equation}

FIGURE 10.4
Series of one-loop (or ‘bubble’) insertions in the C propagator.

FIGURE 10.5
$O(g^4)$ contribution to $\Pi_C(q^2)$.


The geometric series in (10.12) may be summed, at least formally$^1$, to give $(1 - r)^{-1}$ so that
(10.12) becomes

\begin{equation}
\frac{i}{q^2 - m_C^2}\,\frac{1}{1 - \Pi_C^{[2]}(q^2)/(q^2 - m_C^2)} = \frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(q^2)}. \tag{10.14}
\end{equation}

In this form it is particularly clear that we are dealing with corrections to the simple C
propagator $i/(q^2 - m_C^2)$. $\Pi_C^{[2]}$ is called the $O(g^2)$ self-energy.
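
The resummation leading to (10.14) is easy to verify numerically for a case in which the series converges. The Python sketch below is an illustration under stated assumptions: the self-energy is replaced by a plain finite number `pi` (as would be the case with a cut-off in place), and the values of `q2`, `m2` and `pi` are arbitrary choices with $|r| < 1$. The function names are ours.

```python
def bubble_chain(q2, m2, pi, n_terms):
    """Partial sum of the series (10.11), keeping n_terms terms.

    Each extra bubble multiplies the previous term by
    (-i*pi) * i/(q2 - m2) = pi/(q2 - m2), which is r of (10.13).
    """
    free = 1j / (q2 - m2)   # simple C propagator, i*epsilon dropped
    r = pi / (q2 - m2)
    total = 0j
    term = free
    for _ in range(n_terms):
        total += term
        term *= r
    return total


def resummed_propagator(q2, m2, pi):
    """Right-hand side of (10.14)."""
    return 1j / (q2 - m2 - pi)


if __name__ == "__main__":
    q2, m2, pi = 2.0, 1.0, 0.3   # assumed illustrative values, r = 0.3
    for n in (1, 2, 5, 20):
        print(n, bubble_chain(q2, m2, pi, n), resummed_propagator(q2, m2, pi))
```

With these values the partial sums approach the closed form geometrically; for $|r| \geq 1$ the partial sums do not converge, which is the caveat recorded in the footnote.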
Before proceeding with the analysis of (10.14), we note that it is a special case of the
more general expression

\begin{equation}
\tilde{D}_C(q^2) = \frac{i}{q^2 - m_C^2 - \Pi_C(q^2)} \tag{10.15}
\end{equation}

where $\tilde{D}_C(q^2)$ is the complete (including all corrections) C propagator, and $\Pi_C(q^2)$ is the
sum of all ‘insertions’ in the C line, excluding those which can be cut into two separate bits by
severing a single line: $\Pi_C(q^2)$ is the one-particle irreducible self-energy and we must exclude
all one-particle bits from it as they are already included in the geometric series summation
(cf (10.11)). The amplitude $\Pi_C^{[2]}$ which we have calculated is simply the lowest-order ($O(g^2)$)
contribution to $\Pi_C(q^2)$; an $O(g^4)$ contribution to $\Pi_C(q^2)$ is shown in figure 10.5.


10.1.2 Mass shift 


We return to the expression (10.14) which includes the effect of all the iterated $O(g^2)$
bubbles in the C propagator, where $-i\Pi_C^{[2]}(q^2)$ is given by

\begin{equation}
-i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q - k)^2 - m_B^2 + i\epsilon}. \tag{10.16}
\end{equation}

$^1$Properly speaking this is valid only for $|r| < 1$, yet we know that $\Pi_C^{[2]}(q^2)$ actually diverges! As we
shall see, however, renormalization will be carried out after making such quantities finite by ‘regularization’
(section 10.3.2), and then working systematically at a given order in g (section 10.4).

Postponing the evaluation of (10.16) (and in particular the treatment of its divergence)
until section 10.3, we proceed to discuss the further implications of (10.14).

First, suppose $\Pi_C^{[2]}$ were simply a constant, $\delta m_C^2$ say. In the absence of this correction,
we know (cf section 6.3.3) that the vanishing of the denominator of the C propagator
would correspond to the ‘mass-shell condition’ $q^2 = m_C^2$ appropriate to a free particle of
momentum $\mathbf{q}$ and energy $q_0 = (\mathbf{q}^2 + m_C^2)^{1/2}$, where $m_C$ is the mass of a C particle. It
seems very plausible, therefore, to interpret the constant $\delta m_C^2$ as a shift in the (mass)$^2$ of
the C particle, the denominator of (10.14) now vanishing at $q_0 = (\mathbf{q}^2 + m_C^2 + \delta m_C^2)^{1/2}$, if
$\Pi_C^{[2]} \approx \delta m_C^2$. The idea that the mass of a particle can be changed from its ‘free space’ value
by the presence of interactions with its ‘environment’ is a familiar one in condensed matter 
physics, as noted above. In the case of electrons in a metal, for example, it is not surprising 
that the presence of the lattice ions, and the attendant band structure, affect the response 
of conduction electrons to external fields, so that their apparent inertia changes. In the 
present case, the ‘environment’ is, in fact, the vacuum. The process described by the bubble
$\Pi_C^{[2]}(q^2)$ is one in which a C particle dissociates virtually into an A-B pair, which then
recombine into the C particle, no other ‘external’ source being present. As in earlier uses 
of the word, by ‘virtual’ here is meant a process in which the participating particles leave 
their mass-shells. Thus, in particular, in the expression (10.16) for $\Pi_C^{[2]}$, it will in general be
the case that $k^2 \neq m_A^2$, and $(q - k)^2 \neq m_B^2$.

In the case of the electron in a metal, both the ‘free’ and the ‘effective’ masses are
measurable quantities. But we cannot get outside the vacuum! This strongly suggests that
what we must mean by ‘the physical (mass)$^2$’ of a particle in our ABC theory is not the ‘free’
(Lagrangian) value $m^2$, which is unmeasurable, but the effective (mass)$^2$ which includes all
vacuum interactions. This ‘physical (mass)$^2$’ may be defined to be that value of $q^2$ for which

\begin{equation}
q^2 - m_i^2 - \Pi_i(q^2) = 0 \tag{10.17}
\end{equation}


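where $\Pi_i(q^2)$ is the complete one-particle irreducible self-energy for particle type ‘i’. If we
call the physical mass $m_{{\rm ph},i}$, then, we will have $q^2 - m_i^2 - \Pi_i(q^2) = 0$ when $q^2 = m^2_{{\rm ph},i}$.
What we are dealing with in (10.14) is just the lowest-order contribution to $\Pi_C(q^2)$,
namely $\Pi_C^{[2]}(q^2)$, so that in our case $m^2_{{\rm ph},C}$ is determined by the condition

\begin{equation}
q^2 - m_C^2 - \Pi_C^{[2]}(q^2) = 0 \quad\text{when}\quad q^2 = m^2_{{\rm ph},C} \tag{10.18}
\end{equation}

which (to this order) is

\begin{equation}
m^2_{{\rm ph},C} = m_C^2 + \Pi_C^{[2]}(m^2_{{\rm ph},C}). \tag{10.19}
\end{equation}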


Once we have calculated $\Pi_C^{[2]}$ (see section 10.3), equation (10.19) could be regarded as
an equation to determine $m^2_{{\rm ph},C}$ in terms of the parameter $m_C^2$ which appeared in the
original ABC Lagrangian. This might, indeed, be the way such an equation would be viewed
in condensed matter physics, where we should know the values of the parameters in the
Lagrangian. But in the field-theory case $m_C^2$ is unobservable, so that such an equation has
no predictive value. Instead, we may regard it as an equation determining (up to $O(g^2)$) $m_C^2$
in terms of $m^2_{{\rm ph},C}$, thus enabling us to eliminate (to this order in g) all occurrences of the
unobservable parameter $m_C^2$ from our amplitudes in favour of the physical parameter $m^2_{{\rm ph},C}$.
Note that $\Pi_C^{[2]}$ contains two powers of g, so that in the spirit of systematic perturbation
theory, the mass shift represented by (10.19) is a second-order correction.
The crucial point here is that $\Pi_C^{[2]}$ depends on the cut-off $\Lambda$, whereas the physical mass
$m^2_{{\rm ph},C}$ clearly does not. But there is nothing to stop us supposing that the unknown and
unobservable Lagrangian parameter $m_C^2$ depends on $\Lambda$ in just such a way as to cancel
the $\Lambda$-dependence of $\Pi_C^{[2]}$, leaving $m^2_{{\rm ph},C}$ independent of $\Lambda$. This is the beginning of the
‘renormalization procedure’ in quantum field theory.
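
The logic of this cancellation can be made concrete with a toy model. In the Python sketch below, `toy_pi` is a hypothetical self-energy chosen only because it grows logarithmically with the cut-off; it is not the actual $\Pi_C^{[2]}$ of the ABC theory, and the coupling factor and the value of the physical (mass)$^2$ are assumed numbers. For each cut-off the Lagrangian parameter is fixed by inverting (10.19), and the pole of the corrected propagator is then located from condition (10.18) by bisection.

```python
import math

G2_OVER_16PI2 = 0.1 / (16.0 * math.pi**2)  # assumed small coupling factor


def toy_pi(q2, cutoff):
    """Hypothetical self-energy with a logarithmic cut-off dependence."""
    return G2_OVER_16PI2 * math.log(cutoff**2 / (q2 + 1.0))


def lagrangian_mass_sq(m_ph2, cutoff):
    """Invert (10.19): m_C^2 = m_ph^2 - Pi(m_ph^2), for a given cut-off."""
    return m_ph2 - toy_pi(m_ph2, cutoff)


def pole_position(m_c2, cutoff, lo=0.0, hi=10.0):
    """Solve q^2 - m_C^2 - Pi(q^2) = 0, condition (10.18), by bisection."""

    def f(q2):
        return q2 - m_c2 - toy_pi(q2, cutoff)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid   # sign change lies in [lo, mid]
        else:
            lo = mid   # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    m_ph2 = 2.0  # assumed physical (mass)^2, held fixed
    for cutoff in (1e2, 1e4, 1e8):
        m_c2 = lagrangian_mass_sq(m_ph2, cutoff)
        print(cutoff, m_c2, pole_position(m_c2, cutoff))
```

In this toy model the Lagrangian parameter drifts as the cut-off is raised, while the position of the pole, which is the only quantity with physical meaning here, stays put.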


10.1.3 Field strength renormalization 


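We now need to consider the more realistic case in which $\Pi_C^{[2]}(q^2)$ is not a constant. Let us
expand it about the point $q^2 = m^2_{{\rm ph},C}$, writing

\begin{equation}
\Pi_C^{[2]}(q^2) \approx \Pi_C^{[2]}(m^2_{{\rm ph},C}) + (q^2 - m^2_{{\rm ph},C})\left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}} + \cdots. \tag{10.20}
\end{equation}

The corrected propagator (10.14) then becomes

\begin{align}
& \frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(m^2_{{\rm ph},C}) - (q^2 - m^2_{{\rm ph},C})\left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}} + \cdots} \tag{10.21} \\
&\quad = \frac{i}{(q^2 - m^2_{{\rm ph},C})\left[1 - \left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right] + O\bigl((q^2 - m^2_{{\rm ph},C})^2\bigr)}. \tag{10.22}
\end{align}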


The expression (10.22) has indeed the expected form for a ‘physical C’ propagator, having
the simple behaviour $\sim 1/(q^2 - m^2_{{\rm ph},C})$ for $q^2 \approx m^2_{{\rm ph},C}$. However, the normalization of this
(corrected) propagator is different from that of the ‘free’ one, $i/(q^2 - m_C^2)$, because of the
extra factor

\[
\left[1 - \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right]^{-1}.
\]

To the order at which we are working ($O(g^2)$), it is consistent to replace this expression by

\[
1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}.
\]

Let us see how this factor may be understood. 

Our $O(g^2)$ corrected propagator is an approximation to the exact propagator which
we may write as $\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle$, in coordinate space, where $|\Omega\rangle$ is the exact
vacuum. The free propagator, however, is $\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle$ as calculated in section 6.3.2.
Consider one term in the latter, $\theta(t_1 - t_2)\langle 0|\hat\phi_C(x_1)\hat\phi_C(x_2)|0\rangle$, and insert a complete set of
free-particle states ‘$1 = \sum_n |n\rangle\langle n|$’ between the two free fields, obtaining

\begin{equation}
\theta(t_1 - t_2)\sum_n \langle 0|\hat\phi_C(x_1)|n\rangle\langle n|\hat\phi_C(x_2)|0\rangle. \tag{10.23}
\end{equation}

The only free particle state $|n\rangle$ having a non-zero matrix element of the free field $\hat\phi_C$ to the
vacuum is the 1-C state, for which $\langle 0|\hat\phi_C(x)|C, k\rangle = e^{-ik\cdot x}$ as we learned in chapters 5 and
6. Thus (10.23) becomes (cf equation (6.92))

\begin{equation}
\theta(t_1 - t_2)\int\frac{d^3k}{(2\pi)^3\,2\omega_k}\,e^{-i\omega_k(t_1 - t_2) + i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)} \tag{10.24}
\end{equation}

which is exactly the first term of equation (6.92). Consider now carrying out a similar
manipulation for the corresponding term of the interacting propagator, obtaining

\begin{equation}
\theta(t_1 - t_2)\sum_n \langle\Omega|\hat\phi_C(x_1)|n\rangle\langle n|\hat\phi_C(x_2)|\Omega\rangle \tag{10.25}
\end{equation}

where the states $|n\rangle$ are now the exact eigenstates of the full Hamiltonian. The crucial
difference between (10.23) and (10.25) is that in (10.25), multi-particle states can appear in
the states $|n\rangle$. For example, the state $|A, B\rangle$ consisting of an A particle and a B particle will
enter, because the interaction couples this state to the 1-C states created and destroyed in
$\hat\phi_C$. Indeed, just such an A+B state is present in $\Pi_C^{[2]}$. This means that, whereas in the free
case the ‘content’ of the state $\langle 0|\hat\phi_C(x)$ was fully exhausted by the 1-C state $|C, k\rangle$ (in
the sense that all overlaps with other states $|n\rangle$ were zero), this is not so in the interacting
case. The ‘content’ of $\langle\Omega|\hat\phi_C(x)$ is not fully exhausted by the state $|C, k\rangle$: rather, it has
overlaps with many other states. Now the sum total of all these overlaps (in the sense of
‘$\sum_n |n\rangle\langle n|$’) must be unity. Thus it seems clear that the ‘strength’ of the single matrix
element $\langle\Omega|\hat\phi_C(x)|C, k\rangle$ in the interacting case cannot be the same as the free case (where
the single state exhausted the completeness sum). However, we expect it to be true that
$\langle\Omega|\hat\phi_C(x)|C, k\rangle$ is still basically the wavefunction for the C-particle. Hence we shall write

\begin{equation}
\langle\Omega|\hat\phi_C(x)|C, k\rangle = \sqrt{Z_C}\,e^{-ik\cdot x} \tag{10.26}
\end{equation}

where $Z_C$ is a constant to take account of the change in normalization (the renormalization,
in fact) required by the altered ‘strength’ of the matrix element.

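If (10.26) is accepted, we can now imagine repeating the steps leading from equation (6.92)
to equation (6.98) but this time for $\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle$, retaining explicitly
only the single-particle state $|C, k\rangle$ in (10.25), and using the physical (mass)$^2$, $m^2_{{\rm ph},C}$. We
should then arrive at a propagator in the interacting case which has the form

\begin{equation}
\begin{aligned}
\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle = {}& \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{iZ_C}{k^2 - m^2_{{\rm ph},C} + i\epsilon} \\
& + \text{multiparticle contributions}.
\end{aligned}
\tag{10.27}
\end{equation}

The single-particle contribution in (10.27), after undoing the Fourier transform, has exactly
the same form as the one we found in (10.22), if we identify the field strength renormalization
constant $Z_C$ with the proportionality factor in (10.22), to this order:

\begin{equation}
Z_C \approx Z_C^{[2]} = 1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \tag{10.28}
\end{equation}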


This is how the change in normalization in (10.22) is to be interpreted. 
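
The statement that the pole of the corrected propagator carries the factor $[1 - d\Pi/dq^2]^{-1}$, and that this equals $1 + d\Pi/dq^2$ up to higher order, can be checked numerically. The Python sketch below is an illustration with assumed ingredients: `toy_pi` is a hypothetical smooth function standing in for $\Pi_C^{[2]}$ (it is not the ABC self-energy), and `G2` and `M_C2` are arbitrary numbers. The pole is found by bisection and the normalization is read off from the behaviour of the propagator just off the pole.

```python
import math

G2 = 0.05    # assumed small coupling-squared factor
M_C2 = 1.0   # assumed Lagrangian (mass)^2


def toy_pi(q2):
    """Hypothetical smooth self-energy, chosen only for illustration."""
    return G2 * math.log(1.0 + q2)


def toy_pi_prime(q2):
    """Derivative of toy_pi with respect to q^2."""
    return G2 / (1.0 + q2)


def physical_mass_sq(lo=0.0, hi=10.0):
    """Solve q^2 - m_C^2 - Pi(q^2) = 0 by bisection."""

    def f(q2):
        return q2 - M_C2 - toy_pi(q2)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def residue_numeric(m_ph2, eps=1e-6):
    """(q^2 - m_ph^2) / (q^2 - m_C^2 - Pi(q^2)) evaluated just off the pole."""
    q2 = m_ph2 + eps
    return (q2 - m_ph2) / (q2 - M_C2 - toy_pi(q2))


if __name__ == "__main__":
    m_ph2 = physical_mass_sq()
    slope = toy_pi_prime(m_ph2)
    print("numerical normalization :", residue_numeric(m_ph2))
    print("1 / (1 - dPi/dq2)       :", 1.0 / (1.0 - slope))
    print("1 + dPi/dq2, cf (10.28) :", 1.0 + slope)
```

The first two numbers agree to numerical accuracy, while the third differs from them only at the order of the square of the slope, which is beyond the order at which we are working.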

It may be helpful to sketch briefly an analogy between this ‘renormalization’ and a 
very similar one in ordinary quantum mechanical perturbation theory. Suppose we have a 
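Hamiltonian $H = H_0 + V$ and that the $|n\rangle$ are a complete set of orthonormal states such
that $H_0|n\rangle = E_n^{(0)}|n\rangle$. The exact eigenstates $|\bar{n}\rangle$ satisfy

\begin{equation}
(H_0 + V)|\bar{n}\rangle = E_n|\bar{n}\rangle. \tag{10.29}
\end{equation}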


To obtain $|\bar{n}\rangle$ and $E_n$ in perturbation theory, we write

\begin{equation}
|\bar{n}\rangle = \sqrt{N_n}\,|n\rangle + \sum_{i \neq n} c_{in}|i\rangle \tag{10.30}
\end{equation}

where, if $|\bar{n}\rangle$ is also normalized, we have

\begin{equation}
1 = N_n + \sum_{i \neq n} |c_{in}|^2. \tag{10.31}
\end{equation}

FIGURE 10.6
$O(g^4)$ contributions to $A + B \to A + B$, involving corrections to the ABC vertices in figure 6.4.


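$N_n$ cannot be unity, since non-zero amounts of the states $|i\rangle$ ($i \neq n$) have been ‘mixed in’ by
the perturbation, just as the A + B state was introduced into the summation ‘$\sum_n |n\rangle\langle n|$’,
in addition to the 1-C state. Inserting (10.30) into (10.29) and taking the bracket with $\langle j|$
yields

\begin{equation}
c_{jn} = -\frac{\langle j|V|\bar{n}\rangle}{E_j^{(0)} - E_n} \tag{10.32}
\end{equation}

which is still an exact expression. The lowest non-trivial approximation to $c_{jn}$ is to take
$|\bar{n}\rangle \approx \sqrt{N_n}\,|n\rangle$ and $E_n \approx E_n^{(0)}$ in (10.32), giving

\begin{equation}
c_{jn} \approx -\sqrt{N_n}\,\frac{\langle j|V|n\rangle}{E_j^{(0)} - E_n^{(0)}} \equiv -\sqrt{N_n}\,\frac{V_{jn}}{E_j^{(0)} - E_n^{(0)}}. \tag{10.33}
\end{equation}

Equation (10.31) then gives $N_n$ as

\begin{equation}
N_n \approx 1\Big/\Bigl(1 + \sum_{j \neq n}|V_{jn}|^2/(E_j^{(0)} - E_n^{(0)})^2\Bigr) \approx 1 - \sum_{j \neq n}|V_{jn}|^2/(E_j^{(0)} - E_n^{(0)})^2 \tag{10.34}
\end{equation}

to second order in $V_{jn}$. The reader may ponder on the analogy between (10.34) and (10.28).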
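
For a two-level system the normalization factor $N_n$ can be found exactly, which allows a direct check of (10.34). The Python sketch below is an illustration with assumed numbers: the Hamiltonian is taken to be the real symmetric matrix with diagonal entries `e1 < e2` and off-diagonal entry `v`, so that the sum in (10.34) has a single term. The function names are ours.

```python
import math


def exact_weight(e1, e2, v):
    """Exact N_n = |<1|1bar>|^2 for H = [[e1, v], [v, e2]], with v != 0 and e1 < e2.

    |1bar> is the exact eigenstate that reduces to |1> as v -> 0.
    """
    mean = 0.5 * (e1 + e2)
    half_gap = 0.5 * (e2 - e1)
    energy = mean - math.sqrt(half_gap**2 + v**2)  # lower exact eigenvalue
    # Unnormalized eigenvector (a, b) from the first row of H x = E x:
    # e1*a + v*b = E*a, solved by a = v, b = E - e1.
    a, b = v, energy - e1
    return a * a / (a * a + b * b)


def second_order_weight(e1, e2, v):
    """The estimate (10.34) with a single admixed state."""
    return 1.0 - v**2 / (e2 - e1) ** 2


if __name__ == "__main__":
    e1, e2 = 0.0, 1.0  # assumed unperturbed energies
    for v in (0.01, 0.05, 0.2):
        print(v, exact_weight(e1, e2, v), second_order_weight(e1, e2, v))
```

For small `v` the two columns agree up to terms of fourth order in `v`, and in every case the weight of the unperturbed state is less than unity, in the same way that $Z_C$ departs from unity once the A + B state is mixed in.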


10.2 The vertex correction 


At the same order ($g^4$) of perturbation theory, we should also include, for consistency, the
processes shown in figures 10.6(a) and (b). Figure 10.6(a), for example, has the general
form

\begin{equation}
-ig\,\frac{i}{q^2 - m_C^2}\,\bigl(-igG^{[2]}(p_A, p'_B)\bigr) \tag{10.35}
\end{equation}

where $-igG^{[2]}$ is the ‘triangle’ loop, given by an expression similar to (10.16) but with a
factor $(-ig)^3$ and three propagators. The ‘vertex correction’ $G^{[2]}$ depends on just two of
its external 4-momenta because the third is determined by 4-momentum conservation, as
usual. Thus, the addition of figure 10.6(a) and the $O(g^2)$ C-exchange tree diagram gives

\begin{equation}
-ig\,\frac{i}{q^2 - m_C^2}\,\bigl\{-ig + \bigl(-igG^{[2]}(p_A, p'_B)\bigr)\bigr\} \tag{10.36}
\end{equation}

from which it seems plausible that $G^{[2]}$ will contribute, among other effects, to a change
in g. This change will be of order $g^2$, since we may write the {...} bracket in (10.36) as

\begin{equation}
-ig\,\bigl\{1 + G^{[2]}(p_A, p'_B)\bigr\} \tag{10.37}
\end{equation}

where $G^{[2]}$ is dimensionless and contains a $g^2$ factor (hence the superscript [2]).

Once again, the effect of interactions with the environment (i.e. vacuum fluctuations) 
has been to alter the value of a Lagrangian parameter away from the ‘free’ value. In the case 
of g, the change is analogous to that in which an electron in a metal acquires an ‘effective 
charge’. How we define the ‘physical g’ is less clear than in the case of the physical mass and 
we shall not pursue this point here, since we shall discuss it again in the more interesting 
case of the charge ‘e’ in QED, in the following chapter. At all events, some suitable definition 
of ‘$g_{\mathrm{ph}}$’ can be given, so that it can be related to $g$ after the relevant amplitudes have been
computed. 

Let us briefly recapitulate progress. We are studying higher-order (one-loop) corrections 
to tree graph amplitudes in the ABC model, which has the Lagrangian density: 


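$$\hat{\mathcal{L}} = \sum_{i=A,B,C}\left\{\tfrac{1}{2}\partial_\mu\hat\phi_i\,\partial^\mu\hat\phi_i - \tfrac{1}{2}m_i^2\hat\phi_i^2\right\} - g\hat\phi_A\hat\phi_B\hat\phi_C. \tag{10.38}$$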


We have found that the loops considered so far, namely those in figures 10.1 and 10.5, have 
the following qualitative effects: 


(a) the position of the single-particle mass-shell condition becomes shifted away from
the ‘Lagrangian’ value $m_i^2$ to a ‘physical’ value $m_{\mathrm{ph},i}^2$, given by the vanishing of
an expression such as (10.17);


(b) the vacuum-to-one-particle matrix elements of the fields $\hat\phi_i$ have to be ‘renormalized’
by a factor $\sqrt{Z_i}$, given by (10.28) to $O(g^2)$ for $i = C$, and these factors have
to be included in S-matrix elements;


(c) the propagators contain some contribution from two-particle states (e.g. ‘A + B’,
for the C propagator);


(d) the Lagrangian coupling $g$ is shifted by the interactions to a ‘physical’ value $g_{\mathrm{ph}}$.


Responsible for these effects were two ‘elementary’ loops, that for $-i\Pi^{[2]}$ shown in
figure 10.7(a) and that for $-igG^{[2]}$ shown in figure 10.7(b). It is noteworthy that the effects
(a), (b), and (d) all relate to changes (renormalizations, shifts) in the fields and parameters
of the original Lagrangian. We say, collectively, that the ‘fields, masses and coupling have
been renormalized’, i.e. generically altered from their ‘free’ values, by the virtual interactions
represented generically by figures 10.7(a) and (b). However, whereas in condensed
matter physics one might well have the ambition to calculate such effects from first principles,
in the field-theory case that makes no sense. Rather, by rewriting all calculated
expressions (at a given order of perturbation theory) in terms of ‘renormalized’ quantities,
we aim to eliminate the ‘unknown physics scale’, $\Lambda$, from the theory. Let us now see how
this works in more mathematical detail.


FIGURE 10.7 
Elementary one-loop amplitudes: (a) self-energy; (b) vertex correction. 


10.3 Dealing with the bad news: a simple example 


10.3.1 Evaluating $\Pi_C^{[2]}(q^2)$


We turn our attention to the actual evaluation of a one-loop amplitude, beginning with the
simplest, which is $-i\Pi_C^{[2]}(q^2)$:

$$-i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_B^2 + i\epsilon}; \tag{10.39}$$

in particular, we want to know the precise mathematical form of the divergence which
arises when the momentum integral in (10.39) is not cut off at an upper limit $\Lambda$. This will
necessitate the introduction of a few modest tricks from a large armoury (mostly due to
Feynman) for dealing with such integrals.

The first move in evaluating (10.39) is to ‘combine the denominators’ using the identity
(problem 10.2)

$$\frac{1}{AB} = \int_0^1\frac{dx}{[(1-x)A + xB]^2} \tag{10.40}$$

(similar ‘Feynman identities’ exist for combining three or more denominator factors).
Applying (10.40) to (10.39) we obtain

$$-i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{[(1-x)(k^2 - m_A^2 + i\epsilon) + x((q-k)^2 - m_B^2 + i\epsilon)]^2}. \tag{10.41}$$

Collecting up terms inside the [...] bracket and changing the integration variable to
$k' = k - xq$ leads to (problem 10.3)

$$-i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k'}{(2\pi)^4}\,\frac{1}{(k'^2 - \Delta + i\epsilon)^2} \tag{10.42}$$

where

$$\Delta = -x(1-x)q^2 + xm_B^2 + (1-x)m_A^2. \tag{10.43}$$

The $d^4k'$ integral means $dk'^0\,d^3\boldsymbol{k}'$, and $k'^2 = (k'^0)^2 - \boldsymbol{k}'^2$.
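
Both steps, the Feynman identity (10.40) and the completion of the square that leads to (10.43), are easy to check numerically. The sketch below is illustrative only: the numbers `a`, `b`, `k`, `q`, `x` and the squared masses are arbitrary assumed values, and `k` and `q` are treated as ordinary numbers standing in for the 4-vectors, which is enough to test the algebra because only the bilinearity of the product is used.

```python
def feynman_parameter_integral(a, b, steps=2000):
    """Simpson's rule estimate of the integral over 0 <= x <= 1 of 1 / ((1 - x)*a + x*b)**2."""
    h = 1.0 / steps  # steps is assumed even, as Simpson's rule requires

    def f(x):
        return 1.0 / ((1.0 - x) * a + x * b) ** 2

    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def delta(x, q2, m_a2, m_b2):
    """The quantity Delta of (10.43), written in terms of squared masses."""
    return -x * (1.0 - x) * q2 + x * m_b2 + (1.0 - x) * m_a2

# Check of the identity (10.40) for two arbitrary positive numbers.
a, b = 2.0, 5.0
lhs = 1.0 / (a * b)
rhs = feynman_parameter_integral(a, b)

# Check of the completed square: combined denominator versus (k - x q)^2 - Delta.
k, q, x, m_a2, m_b2 = 0.7, 1.9, 0.3, 1.0, 2.25
combined = (1.0 - x) * (k * k - m_a2) + x * ((q - k) ** 2 - m_b2)
shifted = (k - x * q) ** 2 - delta(x, q * q, m_a2, m_b2)

print(lhs, rhs)
print(combined, shifted)
```

The two printed pairs should agree to within rounding and integration error.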
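We now perform the $k'^0$ integration in (10.42), for which we will need the contour integration
techniques explained in appendix F. The integral we want to calculate is

$$\int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\partial}{\partial A}\int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]} \equiv \frac{\partial}{\partial A}I(A) \tag{10.44}$$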
FIGURE 10.8
Location of the poles of (10.42) in the complex $k'^0$-plane.


where $A = \boldsymbol{k}'^2 + \Delta - i\epsilon$. We rewrite $I(A)$ as

$$I(A) = \lim_{R\to\infty}\int_{C_R}\frac{dz}{z^2 - A} \tag{10.45}$$

where the contour $C_R$ is the real axis from $-R$ to $R$. Next, we identify the points where
the integrand $[z^2 - A]^{-1}$ ceases to be analytic. In this case they are simple poles (see
Appendix F) at $z = \pm\sqrt{A} = \pm(\boldsymbol{k}'^2 + \Delta - i\epsilon)^{1/2}$. Figure 10.8 shows the location of these
points in the complex $z$ ($k'^0$)-plane. Note that the ‘$i\epsilon$’ determines in which half-plane each
point lies (compare the similar role of the ‘$i\epsilon$’ in $(z + i\epsilon)^{-1}$, in the proof in appendix F of
the representation (6.93) for the $\theta$-function). We must now ‘close the contour’ in order to
be able to use Cauchy’s integral formula of (F.19). We may do this by means of a large
semicircle in either the upper ($C_+$) or lower ($C_-$) half-plane (again compare the discussion
in appendix F). The contribution from either such semicircle vanishes as $R \to \infty$, since on
either we have $z = Re^{i\theta}$, and

$$\int_{C_+\ \mathrm{or}\ C_-}\frac{dz}{z^2 - A} = \int\frac{Re^{i\theta}\,i\,d\theta}{R^2e^{2i\theta} - A} \to 0 \quad \text{as } R \to \infty. \tag{10.46}$$


For definiteness, let us choose to close the contour in the upper half-plane. Then we are
evaluating

$$I(A) = \lim_{R\to\infty}\oint_{C = C_R + C_+}\frac{dz}{(z - \sqrt{A})(z + \sqrt{A})} \tag{10.47}$$

around the closed contour $C$ shown in figure 10.9, which encloses the single non-analytic
point at $z = -\sqrt{A}$. Applying Cauchy’s integral formula (F.19) with $a = -\sqrt{A}$ and
$f(z) = (z - \sqrt{A})^{-1}$, we find

$$I(A) = 2\pi i\,\frac{1}{-2\sqrt{A}} = -\frac{\pi i}{\sqrt{A}} \tag{10.48}$$

and thus

$$\int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\pi i}{2A^{3/2}}. \tag{10.49}$$
FIGURE 10.9
The closed contour C used in the integral (10.47).

The reader may like to try taking the other choice ($C_-$) of closing contour, and check that
the answer is the same. Reinstating the remaining integrals in (10.42) we have finally (as
$\epsilon \to 0$)

$$-i\Pi_C^{[2]}(q^2) = \frac{ig^2}{8\pi^2}\int_0^1 dx\int_0^\infty\frac{u^2\,du}{(u^2 + \Delta)^{3/2}} \tag{10.50}$$

where $u = |\boldsymbol{k}'|$ and the integration over the angles of $\boldsymbol{k}'$ has yielded a factor of $4\pi$. We
see that the $u$-integral behaves as $\int du/u$ for large $u$, which is logarithmically divergent, as
expected from the start.
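
The contour-integration result (10.49) can also be confirmed by brute-force numerical integration along the real axis. The sketch below is an illustration rather than part of the derivation: it assumes a finite, illustrative value of $A$ with a negative imaginary part (standing in for the ‘$-i\epsilon$’), and it uses the substitution $z = \tan t$, a choice made here purely for numerical convenience, to map the real line onto a finite interval.

```python
import cmath
import math

def k0_integral_numeric(a, steps=4000):
    """Midpoint-rule estimate of the integral over all real z of 1 / (z**2 - a)**2.

    With z = tan(t) the integrand becomes cos(t)**2 / (sin(t)**2 - a*cos(t)**2)**2
    on -pi/2 < t < pi/2, which is smooth as long as a is not a positive real number.
    """
    h = math.pi / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        t = -0.5 * math.pi + (i + 0.5) * h
        c2 = math.cos(t) ** 2
        s2 = math.sin(t) ** 2
        total += c2 / (s2 - a * c2) ** 2
    return total * h

def k0_integral_closed_form(a):
    """The right-hand side of (10.49), pi*i / (2 * a**(3/2)), using the principal square root."""
    return cmath.pi * 1j / (2.0 * cmath.sqrt(a) ** 3)

# Illustrative (assumed) value: positive real part, negative imaginary part as in 'Delta - i*epsilon'.
a = 2.0 - 0.5j
numeric = k0_integral_numeric(a)
exact = k0_integral_closed_form(a)
print(numeric, exact)
```

For such an $A$ the principal square root lies in the lower half-plane, so that $-\sqrt{A}$ is the pole enclosed by the contour of figure 10.9, in line with the discussion above.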


10.3.2 Regularization and renormalization 


Faced with results which are infinite, one can either try to go back to the very beginnings 
of the theory and see if a totally new start can avoid the infinities or one can see if they 
can somehow be ‘lived with’. The first approach may yet, ultimately, turn out to be correct: 
perhaps a future theory will be altogether free of divergences (such theories do in fact exist, 
but none as yet successfully describes the pattern of particles and forces we actually seem to 
have in Nature). For the moment, it is the second approach which has been pursued—indeed 
with great success as we shall see in the next chapter and in volume 2. 

Accepting the general framework of quantum field theory, then, the first thing we must 
obviously do is to modify the theory in some way so that integrals such as (10.50) do not 
actually diverge, so that we can at least discuss finite rather than infinite quantities. This 
step is called ‘regularization’ of the theory. There are many ways to do this but for our 
present purposes a simple one will do well enough, which is to cut off the u-integration in 
(10.50) at some finite value $\Lambda$ (remember $u$ is $|\boldsymbol{k}'|$, so $\Lambda$ here will have dimensions of energy,
or mass); such a step was given some physical motivation in section 10.1.1. Then we can 
evaluate the integral straightforwardly and move on to the next stage. 

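With the upper limit in (10.50) replaced by $\Lambda$, we can evaluate the $u$-integral, obtaining
(problem 10.4)

$$\Pi_C^{[2]}(q^2, \Lambda^2) = -\frac{g^2}{8\pi^2}\int_0^1 dx\left\{\ln\left(\frac{\Lambda + (\Lambda^2 + \Delta)^{1/2}}{\Delta^{1/2}}\right) - \frac{\Lambda}{(\Lambda^2 + \Delta)^{1/2}}\right\} \tag{10.51}$$

where from (10.43)

$$\Delta = -x(1-x)q^2 + xm_B^2 + (1-x)m_A^2. \tag{10.52}$$

Note that $\Delta > 0$ for $q^2 < 0$.

Inspection of (10.51) shows that as $\Lambda \to \infty$, $\Pi_C^{[2]}(q^2, \Lambda^2)$ contains a divergent part
proportional to $\ln\Lambda$. It is useful to isolate this divergent part, as follows. For large $\Lambda$, we
can expand the terms in (10.51) in powers of $\Delta/\Lambda^2$, writing

$$\Lambda + (\Lambda^2 + \Delta)^{1/2} \approx 2\Lambda\left(1 + \frac{\Delta}{4\Lambda^2}\right) \tag{10.53}$$

and

$$\frac{\Lambda}{(\Lambda^2 + \Delta)^{1/2}} \approx 1 - \frac{\Delta}{2\Lambda^2}. \tag{10.54}$$

It follows that

$$\Pi_C^{[2]}(q^2, \Lambda^2) \approx -\frac{g^2}{8\pi^2}\int_0^1 dx\left\{\ln\Lambda + (\ln 2 - 1) - \tfrac{1}{2}\ln\Delta\right\} \tag{10.55}$$

where terms that go to zero as $\Lambda \to \infty$ have been omitted.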
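
The cut-off integral and its large-$\Lambda$ form can be compared directly. The following sketch, with illustrative assumed values for the cut-off and for $\Delta$ (in arbitrary units), evaluates the $u$-integral of (10.50) numerically, compares it with the expression in braces in (10.51), and then compares the latter with the asymptotic form in braces in (10.55).

```python
import math

def u_integral_numeric(cutoff, delta, steps=20000):
    """Simpson's rule for the integral of u^2 / (u^2 + delta)^(3/2) over 0 <= u <= cutoff."""
    h = cutoff / steps  # steps is assumed even, as Simpson's rule requires

    def f(u):
        return u * u / (u * u + delta) ** 1.5

    total = f(0.0) + f(cutoff)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def u_integral_closed_form(cutoff, delta):
    """The expression in braces in (10.51)."""
    root = math.sqrt(cutoff ** 2 + delta)
    return math.log((cutoff + root) / math.sqrt(delta)) - cutoff / root

def u_integral_large_cutoff(cutoff, delta):
    """The expression in braces in (10.55): ln(cutoff) + (ln 2 - 1) - (1/2) ln(delta)."""
    return math.log(cutoff) + (math.log(2.0) - 1.0) - 0.5 * math.log(delta)

delta_value = 1.3  # assumed illustrative value of Delta
numeric = u_integral_numeric(50.0, delta_value)
closed = u_integral_closed_form(50.0, delta_value)
closed_large = u_integral_closed_form(1.0e4, delta_value)
asymptotic_large = u_integral_large_cutoff(1.0e4, delta_value)
print(numeric, closed)
print(closed_large, asymptotic_large)
```

The difference between the last two numbers is of order $\Delta/\Lambda^2$, which is the size of the terms dropped in (10.53) and (10.54).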
Relation (10.19) then becomes

$$m_C^2(\Lambda^2) = m_{\mathrm{ph},C}^2 - \Pi_C^{[2]}(q^2 = m_{\mathrm{ph},C}^2, \Lambda^2) \tag{10.56}$$

and there will be similar relations for the A and B masses. As noted previously, after
(10.19), the shift represented by (10.56) is an $O(g^2)$ perturbative correction (because $\Pi_C^{[2]}$
contains a factor $g^2$), so that, again in the spirit of systematic perturbation theory, it
will be adequate to this order in $g^2$ to replace the Lagrangian masses $m_A^2$, $m_B^2$, and $m_C^2$
inside the expressions for $\Pi_A^{[2]}$, $\Pi_B^{[2]}$, and $\Pi_C^{[2]}$ by their physical counterparts. In this way the
relations (10.56) and the two similar ones give us the prescription for rewriting the $m_i^2$ in
terms of the $m_{\mathrm{ph},i}^2$ and $\Lambda$. Of course, when this is done in the propagators, the result is
just to produce the desired form $\sim(q^2 - m_{\mathrm{ph},i}^2)^{-1}$, to this order.

So, for the propagator at this one-loop order, the effect of such mass shifts is essentially
trivial: the large $\Lambda$ behaviour is simply absorbed into $m_i^2$. What about $Z_C$? This was defined
via (10.28) in terms of the quantity

$$\left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}. \tag{10.57}$$

However, equation (10.55) shows that the divergent part of $\Pi_C^{[2]}$ is independent of $q^2$, or
equivalently that the quantity (10.57) is finite. It follows that $Z_C$ is finite in this theory.
In other theories, quantities analogous to (10.55) might contain a $q^2$-dependent divergence,
which would be formally absorbed in the rescaling represented by $Z_C$.

We may also analyse the vertex correction $G^{[2]}$ of figure 10.6, and conclude that it too is
finite, because there are now three propagators giving six powers of $k$ in the denominator,
with still only a four-dimensional $d^4k$ integration. Once again, the analogous vertex correction
in QED is divergent, as we shall see in chapter 11; there too this divergence can be
absorbed into a redefinition of the physical charge. The ABC theory is, in fact, a ‘super- 
renormalizable’ one, meaning (loosely) that it has fewer divergences than might be expected. 
We shall come back to the classification of theories (renormalizable, non-renormalizable, and 
super-renormalizable) at the end of the following chapter. 

While it is not our purpose to present a full discussion of one-loop renormalization in 
the ABC theory (because it is not of any direct physical interest) we will use it to introduce 
one more important idea before turning, in the next chapter, to one-loop QED. 


10.4 Bare and renormalized perturbation theory 
10.4.1 Reorganizing perturbation theory 


We have seen that, of the one-loop effects listed at the end of section 10.2, the mass shifts
given by equations such as (10.14) do involve formal divergences as $\Lambda \to \infty$, but the vertex
correction and field strength renormalization are finite in the ABC theory. We shall find
that in QED the corresponding quantities are all divergent, so that the perturbative replacement
of all Lagrangian parameters by their ‘physical’ counterparts, together with field
strength renormalizations, is mandatory in QED in order to get rid of $\ln\Lambda$ terms. However,
this process—of evaluating the connections between the two sets of parameters, and then 
inserting them into all the calculated amplitudes—is likely to be very cumbersome. In this 
section, we shall introduce an alternative formulation, which has both calculational and 
conceptual advantages. 

By way of motivation, consider the QED analogue of the divergent part of equation
(10.7), which contributes a correction to the bare electron mass of the form $\alpha m\ln(\Lambda/m)$
where $m$ is the electron mass. At $\Lambda = 100$ GeV the magnitude of this is about 0.04 MeV
(if we take $m$ to have the physical value), which is a shift of some 10%. The application of
perturbation theory would seem more plausible if this kind of correction were to be included
from the start, so that the ‘free’ part of the Hamiltonian (or Lagrangian) involved the
physical fields and parameters, rather than the (unobserved) ones appearing in the original
theory. Then the main effects, in some sense, would already be included by the use of these
(empirical) physical quantities, and corrections would be ‘more plausibly’ small. This is
indeed the main reason for the usefulness of such ‘effective’ parameters in the analogous
case of condensed matter physics. Actually, of course, in quantum field theory the corrections
will be just as infinite (if we send $\Lambda$ to infinity) in this approach also, since whichever way
we set the calculation up, we shall get loops, which are divergent. All the same, this kind
of ‘reorganization’ does offer a more systematic approach to renormalization.
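
The size of the shift quoted above is a one-line estimate. The sketch below simply evaluates $\alpha m\ln(\Lambda/m)$; the numerical values of $\alpha$ and of the electron mass are standard approximate values supplied here as assumptions, and any numerical factor of order unity multiplying the logarithm is ignored, as in the text.

```python
import math

ALPHA = 1.0 / 137.036    # fine structure constant (approximate value, assumed here)
M_ELECTRON_MEV = 0.511   # electron mass in MeV (approximate value, assumed here)

def mass_shift_estimate(cutoff_mev, mass_mev=M_ELECTRON_MEV, alpha=ALPHA):
    """Order-of-magnitude estimate alpha * m * ln(cutoff / m), in MeV."""
    return alpha * mass_mev * math.log(cutoff_mev / mass_mev)

shift = mass_shift_estimate(100.0e3)   # cut-off of 100 GeV, expressed in MeV
fraction = shift / M_ELECTRON_MEV
print(shift, fraction)
```

The result is a few hundredths of an MeV, that is a fractional shift of roughly ten per cent; because the dependence on the cut-off is only logarithmic, even a much larger $\Lambda$ changes this estimate rather slowly.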

To illustrate the idea, consider again our ABC Lagrangian 


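$$\hat{\mathcal{L}} = \hat{\mathcal{L}}_{0,A} + \hat{\mathcal{L}}_{0,B} + \hat{\mathcal{L}}_{0,C} + \hat{\mathcal{L}}_{\mathrm{int}} \tag{10.58}$$

where

$$\hat{\mathcal{L}}_{0,C} = \tfrac{1}{2}\partial_\mu\hat\phi_C\,\partial^\mu\hat\phi_C - \tfrac{1}{2}m_C^2\hat\phi_C^2 \tag{10.59}$$

and similarly for $\hat{\mathcal{L}}_{0,A}$, $\hat{\mathcal{L}}_{0,B}$; and where

$$\hat{\mathcal{L}}_{\mathrm{int}} = -g\hat\phi_A\hat\phi_B\hat\phi_C. \tag{10.60}$$

There are two obvious moves to make: (i) introduce the rescaled (renormalized) fields
by

$$\hat\phi_{\mathrm{ph},i}(x) = Z_i^{-1/2}\hat\phi_i(x) \tag{10.61}$$

in order to get rid of the $\sqrt{Z_i}$ factors in the S-matrix elements and (ii) introduce the physical
masses $m_{\mathrm{ph},i}^2$. Consider first the non-interacting parts of $\hat{\mathcal{L}}$, namely

$$\hat{\mathcal{L}}_0 = \hat{\mathcal{L}}_{0,A} + \hat{\mathcal{L}}_{0,B} + \hat{\mathcal{L}}_{0,C}. \tag{10.62}$$

Singling out the C-parameters for definiteness, $\hat{\mathcal{L}}_0$ can then be written as

$$\begin{aligned}
\hat{\mathcal{L}}_0 &= \tfrac{1}{2}Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}m_C^2Z_C\,\hat\phi_{\mathrm{ph},C}^2 + \cdots \\
&= \tfrac{1}{2}\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}m_{\mathrm{ph},C}^2\,\hat\phi_{\mathrm{ph},C}^2 \\
&\quad + \tfrac{1}{2}(Z_C - 1)\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}(m_C^2Z_C - m_{\mathrm{ph},C}^2)\,\hat\phi_{\mathrm{ph},C}^2 + \cdots
\end{aligned} \tag{10.63}$$

$$\equiv \hat{\mathcal{L}}_{0\mathrm{ph},C} + \left\{\tfrac{1}{2}\delta Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C} - \tfrac{1}{2}(\delta Z_C\,m_{\mathrm{ph},C}^2 + \delta m_C^2\,Z_C)\,\hat\phi_{\mathrm{ph},C}^2\right\} + \cdots \tag{10.64}$$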
where $\hat{\mathcal{L}}_{0\mathrm{ph},C}$ is the standard free-C Lagrangian in terms of the physical field and mass, which
leads to a Feynman propagator $i/(k^2 - m_{\mathrm{ph},C}^2 + i\epsilon)$ in the usual way; also, $\delta Z_C = Z_C - 1$ and
$\delta m_C^2 = m_C^2 - m_{\mathrm{ph},C}^2$. In (10.64) the dots signify similar rearrangements of $\hat{\mathcal{L}}_{0,A}$ and $\hat{\mathcal{L}}_{0,B}$.
Note that $Z_C$ and $m_C^2$ are understood to depend on $\Lambda$, as usual, although this has not been
indicated explicitly.

FIGURE 10.10
Counter term corresponding to the terms in braces in (10.64).

We now regard ‘$\hat{\mathcal{L}}_{0\mathrm{ph},A} + \hat{\mathcal{L}}_{0\mathrm{ph},B} + \hat{\mathcal{L}}_{0\mathrm{ph},C}$’ as the ‘unperturbed’ part of $\hat{\mathcal{L}}$, and all the
remainder of (10.64) as perturbations additional to the original $\hat{\mathcal{L}}_{\mathrm{int}}$ (much of theoretical
physics consists of exploiting the identity ‘$a + b = (a + c) + (b - c)$’). The effect of this
rearrangement is to introduce new perturbations, namely $\tfrac{1}{2}\delta Z_C\,\partial_\mu\hat\phi_{\mathrm{ph},C}\,\partial^\mu\hat\phi_{\mathrm{ph},C}$ and the
$-\tfrac{1}{2}(\delta Z_C\,m_{\mathrm{ph},C}^2 + \delta m_C^2\,Z_C)\,\hat\phi_{\mathrm{ph},C}^2$ term in (10.64), together with similar terms for the A and B fields. Such additional
perturbations are called counter terms and they must be included in our new perturbation
theory based on the $\hat{\mathcal{L}}_{0\mathrm{ph},i}$ pieces. As usual, this is conveniently implemented in terms of
associated Feynman diagrams. Since both of these counter terms involve just the square
of the field, it should be clear that they only have non-zero matrix elements between one-particle
states, so that the associated diagram has the form shown in figure 10.10, which
includes both these C-contributions. Problem 10.5 shows that the Feynman rule for figure
10.10 is that it contributes $i[\delta Z_C\,k^2 - (\delta Z_C\,m_{\mathrm{ph},C}^2 + \delta m_C^2\,Z_C)]$ to the $1C \to 1C$ amplitude.


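The original interaction term $\hat{\mathcal{L}}_{\mathrm{int}}$ may also be rewritten in terms of the physical fields
and a physical (renormalized) coupling constant $g_{\mathrm{ph}}$:

$$\begin{aligned}
-g\hat\phi_A\hat\phi_B\hat\phi_C &= -g(Z_AZ_BZ_C)^{1/2}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C} \\
&= -g_{\mathrm{ph}}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C} - (Z_V - 1)\,g_{\mathrm{ph}}\,\hat\phi_{\mathrm{ph},A}\hat\phi_{\mathrm{ph},B}\hat\phi_{\mathrm{ph},C}
\end{aligned} \tag{10.65}$$

where

$$Z_V\,g_{\mathrm{ph}} = g(Z_AZ_BZ_C)^{1/2}. \tag{10.66}$$

The interpretation of (10.66) is clearly that ‘$g_{\mathrm{ph}}$’ is the coupling constant describing the
interactions among the $\hat\phi_{\mathrm{ph},i}$ fields, while the ‘$(Z_V - 1)$’ term is another counter term,
having the structure shown in figure 10.11.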

In summary, we have reorganized L so as to base perturbation theory on a part de- 
scribing the free renormalized fields (rather than the fields in the original Lagrangian); in 
this formulation we find that, in addition to the (renormalized) ABC-interaction term, fur- 
ther terms have appeared which are interpreted as additional perturbations, called counter 
terms. These counter terms are determined, at each order in this (renormalized) pertur- 
bation theory, by what are basically self-consistency conditions—such as, for example, the 
requirement that the propagators really do reduce to the physical ones at the ‘mass-shell’ 
points. We shall now illustrate this procedure for the C propagator. 
FIGURE 10.11
Counter term corresponding to the ‘$(Z_V - 1)$’ term in (10.66).


10.4.2 The $O(g_{\mathrm{ph}}^2)$ renormalized self-energy revisited: how counter
terms are determined by renormalization conditions


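Let us return to the calculation of the C propagator, following the same procedure as in
section 10.1, but this time ‘perturbing’ away from $\hat{\mathcal{L}}_{0\mathrm{ph},i}$ and including the contribution
from the counter term of figure 10.10, in addition to the $O(g_{\mathrm{ph}}^2)$ self energy. The expression
(10.14) will now be replaced by

$$\frac{i}{q^2 - m_{\mathrm{ph},C}^2 + q^2\,\delta Z_C - \delta Z_C\,m_{\mathrm{ph},C}^2 - \delta m_C^2\,Z_C - \Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)} \tag{10.67}$$

where

$$-i\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) = (-ig_{\mathrm{ph}})^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_{\mathrm{ph},A}^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_{\mathrm{ph},B}^2 + i\epsilon} \tag{10.68}$$

and where we have indicated the cut-off dependence on the left-hand side, leaving it understood
on the right. Comparing (10.68) with (10.39) we see that they are exactly the same,
except that $\Pi_{\mathrm{ph},C}^{[2]}$ involves the ‘physical’ coupling constant $g_{\mathrm{ph}}$ and the physical masses, as
expected in this renormalized perturbation theory. In particular, $\Pi_{\mathrm{ph},C}^{[2]}$ will be divergent in
exactly the same way as $\Pi_C^{[2]}$, as the cut-off $\Lambda$ goes to infinity.

The essence of this ‘reorganized’ perturbation theory is that we now determine $\delta Z_C$
and $\delta m_C^2$ from the condition that as $q^2 \to m_{\mathrm{ph},C}^2$, the propagator (10.67) reduces to
$i/(q^2 - m_{\mathrm{ph},C}^2)$, i.e. it correctly represents the physical C propagator at the mass-shell point,
with standard normalization. Expanding $\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)$ about $q^2 = m_{\mathrm{ph},C}^2$ then, we reach the
approximate form of (10.67), valid for $q^2 \approx m_{\mathrm{ph},C}^2$:

$$\frac{i}{(q^2 - m_{\mathrm{ph},C}^2)\,Z_C - \delta m_C^2\,Z_C - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) - (q^2 - m_{\mathrm{ph},C}^2)\left.\dfrac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}}. \tag{10.69}$$

Requiring that this has the form $i/(q^2 - m_{\mathrm{ph},C}^2)$ gives

$$\begin{aligned}
\text{condition (a)}\qquad & \delta m_C^2 = -Z_C^{-1}\,\Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) \\
\text{condition (b)}\qquad & Z_C = 1 + \left.\frac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}.
\end{aligned} \tag{10.70}$$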
Looking first at condition (b), we see that our renormalization constant $Z_C$ has, in this
approach, been determined up to $O(g_{\mathrm{ph}}^2)$ by an equation that is, in fact, very similar to
(10.28), but it is expressed in terms of physical parameters. As regards (a), since
$Z_C = 1 + O(g_{\mathrm{ph}}^2)$, it is sufficient to replace it by 1 on the right-hand side of (a), so that, to this
order, $\delta m_C^2 \approx -\Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2)$. Once again, this is similar to (10.56), but written in
terms of the physical quantities from the outset. We indicate that these evaluations of $Z_C$
and $\delta m_C^2$ are correct to second order by adding a superscript, as in $Z_C^{[2]}$.

Of course, we have not avoided the infinities (in the limit $\Lambda \to \infty$) in this approach! It
is still true that the loop integral in $\Pi_{\mathrm{ph},C}^{[2]}$ diverges logarithmically and so the mass shift
$(\delta m_C^2)^{[2]}$ is infinite as $\Lambda \to \infty$. Nevertheless, this is a conceptually cleaner way to do the
business. It is called ‘renormalized perturbation theory’, as opposed to our first approach 
which is called ‘bare perturbation theory’. What we there called the ‘Lagrangian fields and 
parameters’ are usually called the ‘bare’ ones; the ‘renormalized’ quantities are ‘clothed’ by 
the interactions. 

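We may now return to our propagator (10.67), and insert the results (10.70) to obtain the
final important expression for the C propagator containing the one-loop $O(g_{\mathrm{ph}}^2)$ renormalized
self-energy:

$$\frac{i}{q^2 - m_{\mathrm{ph},C}^2 - \bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)} \tag{10.71}$$

where

$$\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2) = \Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) - (q^2 - m_{\mathrm{ph},C}^2)\left.\frac{d\Pi_{\mathrm{ph},C}^{[2]}}{dq^2}\right|_{q^2 = m_{\mathrm{ph},C}^2}. \tag{10.72}$$

We remind the reader that $\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2)$ has exactly the same form as $\Pi_C^{[2]}(q^2, \Lambda^2)$ except
that $g^2$ and $m_i^2$ are replaced by $g_{\mathrm{ph}}^2$ and $m_{\mathrm{ph},i}^2$. From (10.55) it then follows that, as $\Lambda \to \infty$,

$$\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) = -\frac{g_{\mathrm{ph}}^2}{8\pi^2}\left\{\ln\Lambda + (\ln 2 - 1) - \tfrac{1}{2}\int_0^1 dx\,\ln\Delta(x, q^2)\right\}, \tag{10.73}$$

where $\Delta(x, q^2)$ is the quantity (10.52) evaluated with the physical masses, and hence

$$\Pi_{\mathrm{ph},C}^{[2]}(q^2, \Lambda^2) - \Pi_{\mathrm{ph},C}^{[2]}(m_{\mathrm{ph},C}^2, \Lambda^2) = \frac{g_{\mathrm{ph}}^2}{16\pi^2}\int_0^1 dx\,\ln\left[\frac{\Delta(x, q^2)}{\Delta(x, m_{\mathrm{ph},C}^2)}\right] \tag{10.74}$$

which is finite as $\Lambda \to \infty$. It is also clear from (10.73) that $d\Pi_{\mathrm{ph},C}^{[2]}/dq^2$ is finite as $\Lambda \to \infty$.
Thus the quantity $\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)$ is finite as $\Lambda \to \infty$, and is understood to be evaluated in
that limit; the subtraction in (10.74) has removed the infinity. The additional subtraction
in (10.72) would in fact have removed a logarithmic divergence in $Z_C$, had there been one.
Note that the form of (10.72) guarantees that the leading behaviour of $\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)$ near
$q^2 = m_{\mathrm{ph},C}^2$ is $(q^2 - m_{\mathrm{ph},C}^2)^2$, so that the behaviour of (10.71) near the mass-shell point is
indeed $i/(q^2 - m_{\mathrm{ph},C}^2)$ as desired.

A succinct way of summarizing our final renormalized result (10.71), with the definition
(10.72), is to say that the C propagator may be defined by (10.71) where the $O(g_{\mathrm{ph}}^2)$
renormalized self-energy $\bar\Pi_{\mathrm{ph},C}^{[2]}$ satisfies the renormalization conditions

$$\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2 = m_{\mathrm{ph},C}^2) = 0, \qquad \left.\frac{d}{dq^2}\bar\Pi_{\mathrm{ph},C}^{[2]}(q^2)\right|_{q^2 = m_{\mathrm{ph},C}^2} = 0. \tag{10.75}$$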
FIGURE 10.12
$O(g^4)$ contribution to $A + B \to A + B$, involving a propagator correction inserted in an
external line.


Relations analogous to (10.75) clearly hold for the A and B self-energies also. In this definition,
the explicit introduction and cancellation of large-$\Lambda$ terms has disappeared from
sight, and all that remains is the importation of one constant from experiment, $m_{\mathrm{ph},C}$, and
a (hidden) rescaling of the fields. It is useful to bear this viewpoint in mind when considering
more general theories, including ones that are ‘non-renormalizable’ (see section 11.8 of the
following chapter).
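
The finiteness of the subtracted self-energy and the renormalization conditions (10.75) can be illustrated numerically. The sketch below is not a calculation from the text: it assumes the large-$\Lambda$ form (10.73), so that (10.72) reduces to a cut-off-independent Feynman-parameter integral, and it uses illustrative values $g_{\mathrm{ph}} = 1$ and $m_{\mathrm{ph},A} = m_{\mathrm{ph},B} = m_{\mathrm{ph},C} = 1$ (arbitrary units), with $q^2$ kept below $(m_{\mathrm{ph},A} + m_{\mathrm{ph},B})^2$ so that $\Delta$ stays positive.

```python
import math

def delta(x, q2, m_a, m_b):
    """Delta(x, q^2) of (10.52), evaluated with the physical masses."""
    return -x * (1.0 - x) * q2 + x * m_b ** 2 + (1.0 - x) * m_a ** 2

def pi_bar(q2, g_ph=1.0, m_a=1.0, m_b=1.0, m_c=1.0, steps=2000):
    """Subtracted self-energy (10.72), using (10.73) for the unsubtracted one.

    The ln(cutoff) terms cancel in the first subtraction, leaving (10.74); the second
    subtraction uses d(Delta)/d(q^2) = -x(1 - x) to evaluate the derivative term.
    """
    m_c2 = m_c ** 2

    def integrand(x):
        d_q = delta(x, q2, m_a, m_b)
        d_m = delta(x, m_c2, m_a, m_b)
        return math.log(d_q / d_m) + (q2 - m_c2) * x * (1.0 - x) / d_m

    h = 1.0 / steps  # Simpson's rule; steps is assumed even
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return g_ph ** 2 / (16.0 * math.pi ** 2) * total * h / 3.0

on_shell = pi_bar(1.0)    # q^2 equal to the physical C mass squared
near_1 = pi_bar(1.01)     # a small step away from the mass shell
near_2 = pi_bar(1.02)     # twice that step
print(on_shell, near_2 / near_1)
```

The on-shell value vanishes, and doubling the distance from the mass shell multiplies the result by very nearly four, which is the quadratic behaviour $(q^2 - m_{\mathrm{ph},C}^2)^2$ implied by the two conditions (10.75).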

There is a lot of good physics in the expression (10.71), which we shall elucidate in 
the realistic case of QED in the next chapter. For the moment, we just whet the reader’s 
appetite by pointing out that (10.71) must amount to the prediction of a finite, calculable 
correction to the Yukawa 1-C exchange potential, which after all is given by the Fourier
transform of the (static form of) the propagator, as we learned long ago. In the case of QED, 
this will amount to a calculable correction to Coulomb’s law, due to radiative corrections, 
as we shall discuss in section 11.5.1. 

There is an important technical implication we may draw from (10.75). Consider the
Feynman diagram of figure 10.12 in which a propagator correction has been inserted in
an external line. This diagram is of order $g_{\mathrm{ph}}^4$, and should presumably be included along
with the others at this order. However, the conditions (10.75), in this case written for
$\bar\Pi_{\mathrm{ph},A}^{[2]}$, imply that it vanishes. Omitting irrelevant factors, the amplitude for figure 10.12 is

$$\bar\Pi_{\mathrm{ph},A}^{[2]}(p_A^2)\,\frac{1}{p_A^2 - m_{\mathrm{ph},A}^2}\,\frac{1}{q^2 - m_{\mathrm{ph},C}^2} \tag{10.76}$$

and we need to take the limit $p_A^2 \to m_{\mathrm{ph},A}^2$ since the external A particle is on-shell. Expanding
$\bar\Pi_{\mathrm{ph},A}^{[2]}$ about the point $p_A^2 = m_{\mathrm{ph},A}^2$ and using conditions (10.75) for $C \to A$ we see
that (10.76) vanishes. Thus with this definition, propagator corrections do not need to be
applied to external lines.


10.5 Renormalizability 


We have seen how divergences present in self-energy loops like figure 10.7(a) can be elimi- 
nated by supposing that the ‘bare’ masses in the original Lagrangian depend on the cut-off 
FIGURE 10.13
(a) $O(g^4)$ one-loop contribution to $A + B \to A + B$; (b) counter term that would be required
if (a) were divergent.


in just such a way as to cancel the divergences, leaving a finite value for the physical masses. 
The latter are, however, parameters to be taken from experiment: they are not calculable. 
Alternatively, we may rephrase perturbation theory in terms of renormalized quantities 
from the outset, in which case the loop divergence is cancelled by appropriate counter 
terms; but again the physical masses have to be taken from experiment. We pointed out 
that, in the ABC theory, neither the field strength renormalizations Z; nor the vertex dia- 
grams of figure 10.5 were divergent, but we shall see in the next chapter that the analogous 
quantities in QED are divergent. These divergences too can be absorbed into redefinitions 
of the ‘physical’ fields and a ‘physical’ coupling constant (the latter again to be taken from 
experiment). Or, again, such divergences can be cancelled by appropriate counter terms in 
the renormalized perturbation theory approach. 

In general, a theory will have various divergences at the one-loop level, and new diver- 
gences will enter as we go up in order of perturbation theory (or number of loops). Typically, 
therefore, quantum field theories betray sensitivity to unknown short-distance physics by 
the presence of formal divergences in loops, as a cut-off $\Lambda \to \infty$. In a renormalizable theory,
this sensitivity can be systematically removed by accepting that a finite number of param- 
eters are incalculable, and must be taken from experiment. These are the suitably defined 
‘physical’ values of the masses and coupling constants appearing in the Lagrangian. Once 
these parameters are given, all other quantities are finite and calculable, to any desired 
order in perturbation theory—assuming, of course, that terms in successive orders diminish 
sensibly in size. 

Alternatively, we may say that a renormalizable theory is one in which a finite number 
of counter terms can be so chosen as to cancel all divergences order by order in renormalized 
perturbation theory. Note, now, that the only available counter terms are the ones which 
arise in the process of ‘reorganizing’ the original theory in terms of renormalized quantities 
plus extra bits (the counter terms). All the counter terms must correspond to masses, 
interactions, etc which are present in the original (or ‘bare’) Lagrangian—which is, in fact, 
the theory we are trying to make sense of! We are not allowed to add in any old kind of 
counter term—if we did, we would be redefining the theory. 

We can illustrate this point by considering, for example, a one-loop ($O(g^4)$) contribution
to $AB \to AB$ scattering, as shown in figure 10.13(a). If this graph is divergent,
we will need a counter term with the structure shown in figure 10.13(b) to cancel the
divergence, but there is no such ‘contact’ $AB \to AB$ interaction in the original theory (it
would have the form $\hat\phi_A^2(x)\hat\phi_B^2(x)$). In fact, the graph is convergent, as indicated by the
usual power-counting (four powers of $k$ in the numerator, eight in the denominator from
the four propagators). And indeed, the ABC theory is renormalizable, or rather, as noted
earlier, ‘super-renormalizable’.
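
The power-counting used for the self-energy, vertex and box diagrams can be written as a small function. This is a minimal sketch of naive (‘superficial’) power counting only, under the assumption that every propagator is a scalar one contributing two powers of $k$ to the denominator and that the vertices carry no momentum factors, as in the ABC theory; the function and dictionary names are illustrative, and the counting says nothing about subdivergences in multi-loop graphs.

```python
def superficial_degree(loops, propagators, spacetime_dim=4):
    """Powers of loop momentum in the numerator minus those in the denominator.

    Each loop integration contributes spacetime_dim powers; each scalar propagator
    removes two. Vertices are assumed to carry no momentum dependence.
    """
    return spacetime_dim * loops - 2 * propagators

def classify(degree):
    """Translate the superficial degree into the expected large-momentum behaviour."""
    if degree > 0:
        return "power divergent"
    if degree == 0:
        return "logarithmically divergent"
    return "convergent"

# One-loop diagrams discussed in this chapter: (number of loops, number of propagators).
diagrams = {
    "self-energy, figure 10.7(a)": (1, 2),
    "vertex correction, figure 10.7(b)": (1, 3),
    "box, figure 10.13(a)": (1, 4),
}
results = {name: classify(superficial_degree(*counts)) for name, counts in diagrams.items()}
for name, verdict in results.items():
    print(name, "->", verdict)
```

Only the self-energy comes out as logarithmically divergent, in agreement with the explicit calculation of section 10.3.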

We shall have something more to say about renormalizability and non-renormalizability 
(is it fatal?), at the end of the following chapter. The first and main business, however, will 
be to apply what we have learned here to QED. 


Problems 

10.1 Carry out the indicated change of variables so as to obtain (10.4) from (10.3). 

10.2 Verify the Feynman identity (10.40). 

10.3 Obtain (10.42) from (10.41). 

10.4 Obtain (10.51) from (10.50), having replaced the upper limit of the u-integral by A. 


10.5 Obtain the Feynman rule quoted in the text for the sum of the counter terms appearing 
in (10.64). 


11 


Loops and Renormalization II: QED 


The present electrodynamics is certainly incomplete, but is no longer certainly incorrect. 
F J Dyson (1949b) 


We now turn to the analysis of loop corrections in QED. As we might expect, a theory with 
fermionic and gauge fields proves to be a tougher opponent than one with only spinless 
particles, even though we restrict ourselves to one-loop diagrams only. 

At the outset we must make one important disclaimer. In QED many loop diagrams 
diverge not only as the loop momentum goes to infinity (‘ultraviolet divergence’) but also 
as it goes to zero (‘infrared divergence’). This phenomenon can only arise when there are 
massless particles in the theory, for otherwise the propagator factors $\sim(k^2 - M^2)^{-1}$ will
always prevent any infinity at low k. Of course, in a gauge theory we do have just such 
massless quanta. Our main purpose here is to demonstrate how the ultraviolet divergences 
can be tamed and we must refer the reader to Weinberg (1995, chapter 13), or to Peskin 
and Schroeder (1995, section 6.5), for instruction in dealing with the infrared problem. 
The remedy lies, essentially, in a careful consideration of the contribution, to physical cross 
sections, of amplitudes involving the real emission of very low frequency photons, along 
with infrared divergent virtual photon processes. It is a ‘technical’ problem, having to do 
with massless particles (of which there are not that many), whereas ultraviolet divergences 
are generic. 


11.1 Counter terms 


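We shall consider the simplest case of a single fermion$^1$ of bare mass $m_0$ and bare charge $e_0$
($e_0 > 0$) interacting with the Maxwell field, for which the bare (i.e. actual!) Lagrangian is

$$\hat{\mathcal{L}} = \bar{\hat\psi}_0(i\not\partial - m_0)\hat\psi_0 - e_0\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\hat A_{0\mu} - \tfrac{1}{4}\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2 \tag{11.1}$$

according to chapter 7. We shall adopt the ‘renormalized perturbation theory’ approach
and begin by introducing field strength renormalizations via

$$\hat\psi = Z_2^{-1/2}\hat\psi_0 \tag{11.2}$$

$$\hat A^\mu = Z_3^{-1/2}\hat A_0^\mu \tag{11.3}$$

where the ‘physical’ fields and parameters will now simply have no ‘0’ subscript. This will
lead to a rewriting of the free and gauge-fixing part of (11.1):

$$\begin{aligned}
\bar{\hat\psi}_0(i\not\partial - m_0)\hat\psi_0 &- \tfrac{1}{4}\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2 \\
&= \bar{\hat\psi}(i\not\partial - m)\hat\psi - \tfrac{1}{4}\hat F_{\mu\nu}\hat F^{\mu\nu} - \frac{1}{2\xi}(\partial\cdot\hat A)^2 \\
&\quad + \left[(Z_2 - 1)\,\bar{\hat\psi}\,i\not\partial\,\hat\psi - \delta m\,\bar{\hat\psi}\hat\psi\right] - \tfrac{1}{4}(Z_3 - 1)\hat F_{\mu\nu}\hat F^{\mu\nu}
\end{aligned} \tag{11.4}$$

where $\xi = \xi_0/Z_3$ and $\delta m = m_0Z_2 - m$ (compare (10.64)). We see the emergence of the
expected ‘$\bar{\hat\psi}\ldots\hat\psi$’ and ‘$\hat F\cdot\hat F$’ counter terms in (11.4), affecting both the fermion and the
gauge-field propagators. Next, we write the interaction in terms of a physical $e$, and the
physical fields, together with a compensating third counter term:

$$-e_0\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\hat A_{0\mu} = -e\,\bar{\hat\psi}\gamma^\mu\hat\psi\hat A_\mu - (Z_1 - 1)\,e\,\bar{\hat\psi}\gamma^\mu\hat\psi\hat A_\mu \tag{11.5}$$

where, with the aid of (11.2) and (11.3),

$$Z_1e = e_0Z_2Z_3^{1/2}. \tag{11.6}$$

$^1$Recall that the SM has three charged leptons ($e$, $\mu$, and $\tau$) with identical QED interactions.

FIGURE 11.1
Counter terms in QED: (a) fermion mass and wavefunction; (b) photon wavefunction; (c)
vertex part.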

The three counter terms are represented diagrammatically as shown in figures 11.1(a), (b), 
and (c), for which the Feynman rules are, respectively, 


(a): $i[\not{k}(Z_2 - 1) - \delta m]$
(b): $-i(g^{\mu\nu}k^2 - k^\mu k^\nu)(Z_3 - 1)$ (11.7)
(c): $-ie\gamma^\mu(Z_1 - 1)$.


These counter terms will compensate for the ultraviolet divergences of the three elementary 
loop diagrams of figure 11.2, and in fact they are sufficient to eliminate all such divergences 
in all QED loops. 

Before proceeding further we remark that we already have a first indication that renormalizing
a gauge theory presents some new features. Consider the two counter terms involving
$Z_2 - 1$ and $Z_1 - 1$; their sum gives

$$\bar{\hat\psi}\left[i\not\partial(Z_2 - 1) - e(Z_1 - 1)\not{\hat A}\right]\hat\psi \tag{11.8}$$

which is not of the ‘gauge principle’ form ‘$i\not\partial - e\not{\hat A}$’! Unless, of course, $Z_1 = Z_2$. This relation
between the two quite different renormalization constants is, in fact, true to all orders in
perturbation theory, as a consequence of a Ward identity (Ward 1950), which is itself a
consequence of gauge invariance. We shall discuss the Ward identity and $Z_1 = Z_2$ at the
one loop level in section 11.6.
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(a) 


(œ) 


(b) 


FIGURE 11.2 
Elementary one-loop divergent diagrams in QED. 
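As a brief check of the remark on (11.8), and assuming nothing beyond the form of that expression, setting $Z_1 = Z_2$ allows the common factor to be pulled out, so that the two counter terms recombine into the gauge-covariant combination:

```latex
\bar{\hat\psi}\,\bigl[\,i\slashed{\partial}\,(Z_2-1) - e\,(Z_2-1)\,\slashed{\hat A}\,\bigr]\,\hat\psi
  \;=\; (Z_2-1)\,\bar{\hat\psi}\,\bigl(i\slashed{\partial} - e\slashed{\hat A}\bigr)\,\hat\psi .
```

With $Z_1 \neq Z_2$ no such factorization is possible, which is why the equality is needed for the counter terms to respect the gauge principle.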


11.2 The $O(e^2)$ fermion self-energy

In analogy with $-i\Pi_C^{[2]}$, the amplitude corresponding to figure 11.2(a) is the fermion self-energy $-i\Sigma^{[2]}$ where

$$-i\Sigma^{[2]}(p) = (-ie)^2 \int \gamma^\nu\, \frac{-ig_{\mu\nu}}{k^2}\, \frac{i}{\slashed{p} - \slashed{k} - m}\, \gamma^\mu\, \frac{d^4k}{(2\pi)^4} \tag{11.9}$$

and we have now chosen the gauge $\xi = 1$. As expected, the $d^4k$ integral in (11.9) diverges for large $k$, this time more seriously than the integral in $\Pi_C^{[2]}$, because there are only three powers of $k$ in the denominator of (11.9) as opposed to four in (10.7). Once again, we need to choose some form of regularization to make (11.9) ultraviolet finite. We shall not be specific (as yet) about what choice we are making, since whatever it may be the outcome will be qualitatively similar to the $\Pi_C^{[2]}$ case.

There is, however, one interesting new feature in this (fermion) case. As previously indicated, power-counting in the integral of (11.9) might lead us to expect that, if we adopt a simple cut-off, the leading ultraviolet divergence of $\Sigma^{[2]}$ would be proportional to $\Lambda$ rather than $\ln\Lambda$. This is because we have that one extra power of $k$ in the numerator and $\Sigma^{[2]}$ has dimensions of mass. However, this is not so. The leading $p$-independent divergence is, in fact, proportional to $m\ln(\Lambda/m)$. The reason for this is important and it has interesting generalizations. Suppose that $m$ in (11.4) were set equal to zero. Then, as we saw in problem 9.4, the two helicity components $\hat\psi_L$ and $\hat\psi_R$ of the fermion field will not be coupled by the QED interaction. It follows that no terms of the form $\bar{\hat\psi}_L\hat\psi_R$ or $\bar{\hat\psi}_R\hat\psi_L$ can be generated, and hence no perturbatively-induced mass term, if $m = 0$. The perturbative mass shift must be proportional to $m$ and therefore, on dimensional grounds, only logarithmically divergent.

There is also a $p$-dependent divergence of the self-energy, of which warning was given in section 10.3.2. As in the scalar case, this will be associated with the field strength renormalization factor $Z_2$. It is proportional to $\slashed{p}\ln(\Lambda/m)$ ($Z_2$ is the coefficient of $i\slashed{\partial}$ in (11.8), which leads to $\slashed{p}$ in momentum space). The upshot is that the fermion propagator, including the one-loop renormalized self-energy, is given by

$$\frac{i}{\slashed{p} - m - \bar\Sigma^{[2]}(p)} \tag{11.10}$$

where (cf (10.74))

$$\bar\Sigma^{[2]}(p) = \Sigma^{[2]}(p) - \Sigma^{[2]}(\slashed{p} = m) - (\slashed{p} - m)\left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m}. \tag{11.11}$$

Whatever form of regularization is used, the twice-subtracted $\bar\Sigma^{[2]}$ will be finite and independent of the regulator when it is removed. In terms of the ‘compensating’ quantities $Z_2$ and $m_0 - m$, we find (problem 11.1, cf (10.70))

$$Z_2^{[2]} = 1 + \left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m}, \qquad m_0^{[2]} - m = -Z_2^{-1}\,\Sigma^{[2]}(\slashed{p} = m). \tag{11.12}$$

Note that, as in the case of $\bar\Pi_C^{[2]}$, the definition (11.11) of $\bar\Sigma^{[2]}$ implies that propagator corrections vanish for external (on-shell) fermions. The quantities $Z_2$ and $m_0$ determined by (11.12) now carry a superscript ‘[2]’ to indicate that they are correct at $O(e^2)$.

We must now remind the reader that, although we have indeed eliminated the ultraviolet divergences in $\bar\Sigma^{[2]}$ by the subtractions of (11.11), there remains an untreated infrared divergence in $d\Sigma^{[2]}/d\slashed{p}$. To show how this is dealt with would take us beyond our intended scope, as explained at the start of the chapter. Suffice it to say that by the introduction of a ‘regulating’ photon mass $\mu_\gamma$, and consideration of relevant real photon processes along with virtual ones, these infrared problems can be controlled (Weinberg 1995, Peskin and Schroeder 1995).


11.3 The $O(e^2)$ photon self-energy

The amplitude corresponding to figure 11.2(b) is $i\Pi^{[2]}_{\mu\nu}(q)$ where

$$i\Pi^{[2]}_{\mu\nu}(q) = (-1)(-ie)^2\, \mathrm{Tr}\int \frac{d^4k}{(2\pi)^4}\, \frac{i}{\slashed{q} + \slashed{k} - m}\, \gamma_\mu\, \frac{i}{\slashed{k} - m}\, \gamma_\nu \tag{11.13}$$

$$= -e^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{\mathrm{Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu]}{[(q + k)^2 - m^2][k^2 - m^2]}. \tag{11.14}$$

Once again, this photon self-energy is analogous to the scalar particle self-energy of chapter 10. There are two new features to be commented on in (11.14). The first is the overall ‘$-1$’ factor, which occurs whenever there is a closed fermion loop. The keen reader may like to pursue this via problem 11.2. The second feature is the appearance of the trace symbol ‘Tr’: this is plausible as the amplitude is basically a $1\gamma \to 1\gamma$ one with no spinor indices, but again the reader can follow that through in problem 11.3.

We now want to go some way into the calculation of $\Pi^{[2]}_{\mu\nu}$ because it will, in the end, contain important physics: for example, corrections to Coulomb’s law. The first step is to evaluate the numerator trace factor using the theorems of section 8.2.3. We find (problem 11.4)

$$\mathrm{Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu] = 4\{(q_\mu + k_\mu)k_\nu + (q_\nu + k_\nu)k_\mu - g_{\mu\nu}((q\cdot k) + k^2 - m^2)\}. \tag{11.15}$$
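The trace identity (11.15) can be checked numerically. The sketch below is only an illustration, not the method of problem 11.4: it assumes the Dirac representation of the $\gamma$-matrices, the metric $(+,-,-,-)$, and arbitrary test values for $q$, $k$ and $m$; all function names are hypothetical helpers.

```python
# Numerical check of the trace identity (11.15) with explicit 4x4 matrices.
# Assumptions: Dirac representation, metric diag(+,-,-,-), arbitrary test momenta.

# gamma^mu (upper index) in the Dirac representation
G_UP = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
METRIC = [1, -1, -1, -1]  # diagonal entries of g_{mu nu}
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(terms):
    """Linear combination of 4x4 matrices; terms is a list of (coefficient, matrix)."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]


# gamma_mu with a lowered index: gamma_mu = g_{mu nu} gamma^nu
G_LOW = [lincomb([(METRIC[mu], G_UP[mu])]) for mu in range(4)]


def slash(a):
    """a-slash = a^mu gamma_mu for contravariant components a^mu."""
    return lincomb([(a[mu], G_LOW[mu]) for mu in range(4)])


def dot(a, b):
    """Minkowski scalar product of two contravariant 4-vectors."""
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def trace_lhs(q, k, m, mu, nu):
    """Left-hand side of (11.15), evaluated by explicit matrix multiplication."""
    qk = [q[i] + k[i] for i in range(4)]
    left = lincomb([(1, slash(qk)), (m, IDENTITY)])
    right = lincomb([(1, slash(k)), (m, IDENTITY)])
    prod = matmul(matmul(matmul(left, G_LOW[mu]), right), G_LOW[nu])
    return sum(prod[i][i] for i in range(4))


def trace_rhs(q, k, m, mu, nu):
    """Right-hand side of (11.15), using covariant (lower-index) components."""
    q_low = [METRIC[i] * q[i] for i in range(4)]
    k_low = [METRIC[i] * k[i] for i in range(4)]
    g = METRIC[mu] if mu == nu else 0
    return 4 * ((q_low[mu] + k_low[mu]) * k_low[nu]
                + (q_low[nu] + k_low[nu]) * k_low[mu]
                - g * (dot(q, k) + dot(k, k) - m * m))


q_test = [1.3, 0.2, -0.7, 0.5]   # arbitrary test values
k_test = [0.4, -1.1, 0.9, 0.3]
m_test = 0.8
max_diff = max(abs(trace_lhs(q_test, k_test, m_test, mu, nu)
                   - trace_rhs(q_test, k_test, m_test, mu, nu))
               for mu in range(4) for nu in range(4))
print("max |LHS - RHS| over all (mu, nu):", max_diff)
```

Because the trace depends only on the Clifford algebra, any other representation of the $\gamma$-matrices should give the same agreement.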


We then use the Feynman identity (10.40) to combine the denominators, yielding

$$\frac{1}{[(q + k)^2 - m^2][k^2 - m^2]} = \int_0^1 dx\, \frac{1}{[k'^2 - \Delta_\gamma + i\epsilon]^2} \tag{11.16}$$

where $k' = k + xq$, $\Delta_\gamma = -x(1 - x)q^2 + m^2$ (note that $\Delta_\gamma$ is precisely the same as $\Delta$ of (10.43) with $m_A = m_B = m$) and we have reinstated the implied ‘$i\epsilon$’. Making the shift to the variable $k'$ in the numerator factor (11.15) produces a revised numerator which is

$$4\{2k'_\mu k'_\nu - g_{\mu\nu}(k'^2 - \Delta_\gamma) - 2x(1 - x)(q_\mu q_\nu - g_{\mu\nu}q^2) + \text{terms linear in } k'\} \tag{11.17}$$

where the terms linear in $k'$ will vanish by symmetry when integrated over $k'$ in (11.14). Our result so far is therefore

$$\begin{aligned}
i\Pi^{[2]}_{\mu\nu}(q^2) = {}& -4e^2 \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4} \left[\frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} - \frac{g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)}\right] \\
& + 8e^2 (q_\mu q_\nu - g_{\mu\nu}q^2) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\, \frac{x(1 - x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2}.
\end{aligned} \tag{11.18}$$
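The two algebraic ingredients used in passing from (11.14) to (11.18) can be illustrated numerically. The sketch below assumes real test values for which the combined denominator does not vanish on $0 \le x \le 1$ (so that no $i\epsilon$ is needed), and uses hypothetical helper names; it checks the Feynman combination $1/(AB) = \int_0^1 dx\, [xA + (1-x)B]^{-2}$ and the completion of the square that defines $k'$ and $\Delta_\gamma$.

```python
# Check of the Feynman-parameter identity and of the shift k' = k + x q.
# Assumptions: real test numbers, metric diag(+,-,-,-), no i*epsilon needed.

METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Minkowski scalar product of two contravariant 4-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def feynman_integral(A, B):
    """Integral over x of 1/[x A + (1 - x) B]^2 from 0 to 1."""
    return simpson(lambda x: 1.0 / (x * A + (1.0 - x) * B) ** 2, 0.0, 1.0)


def combined_denominator(q, k, m, x):
    """x[(q+k)^2 - m^2] + (1-x)[k^2 - m^2], before the shift."""
    qk = [qi + ki for qi, ki in zip(q, k)]
    return x * (dot(qk, qk) - m * m) + (1.0 - x) * (dot(k, k) - m * m)


def shifted_denominator(q, k, m, x):
    """k'^2 - Delta_gamma with k' = k + x q and Delta_gamma = -x(1-x) q^2 + m^2."""
    k_prime = [ki + x * qi for qi, ki in zip(q, k)]
    delta_gamma = -x * (1.0 - x) * dot(q, q) + m * m
    return dot(k_prime, k_prime) - delta_gamma


A_test, B_test = -3.7, -1.2            # same sign: no zero of the denominator
q_vec = [1.3, 0.2, -0.7, 0.5]          # arbitrary test momenta
k_vec = [0.4, -1.1, 0.9, 0.3]
mass = 0.8

print(feynman_integral(A_test, B_test), 1.0 / (A_test * B_test))
print(combined_denominator(q_vec, k_vec, mass, 0.3),
      shifted_denominator(q_vec, k_vec, mass, 0.3))
```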


Consider now the ultraviolet divergences of (11.18), adopting a simple cut-off as a regularization. The terms in the first line are both apparently quadratically divergent, while the integral in the second line is logarithmically divergent. What counter terms do we have to cancel these divergences? The answer is that the ‘$(Z_3 - 1)$’ counter term of figure 11.1(b) is of exactly the right form to cancel the logarithmic divergence in the second line of (11.18), but we have no counter term proportional to the $g_{\mu\nu}$ term in the first line. Note, incidentally, that we can argue from Lorentz covariance (see appendix D) that

$$\int \frac{d^4k'}{(2\pi)^4}\, \frac{k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} = f(\Delta_\gamma)\, g_{\mu\nu} \tag{11.19}$$

so that taking the dot product of both sides with $g^{\mu\nu}$, we deduce that

$$\int \frac{d^4k'}{(2\pi)^4}\, \frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} = \frac{1}{2}\int \frac{d^4k'}{(2\pi)^4}\, \frac{k'^2\, g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)^2}. \tag{11.20}$$

It follows that both the terms in the first line of (11.18) produce a divergence of the form $\sim\Lambda^2 g_{\mu\nu}$, and they do not cancel, at least in our simple cut-off regularization.

A term proportional to $g_{\mu\nu}$ is, in fact, a photon mass term. If the Lagrangian included a mass term for the photon it would have the form $\tfrac{1}{2}m_{\gamma 0}^2\, g_{\mu\nu}\hat A_0^\mu \hat A_0^\nu$, which after introducing the rescaled $\hat A^\mu$ will generate a counter term proportional to $g_{\mu\nu}\hat A^\mu \hat A^\nu$, and an associated Feynman amplitude proportional to $g_{\mu\nu}$. But such an $m_\gamma^2$ term violates gauge invariance! (It is plainly not invariant under (7.69).) Evidently the simple momentum cut-off that we have adopted as a regularization procedure does not respect gauge invariance. We saw in section 8.6.2 that gauge invariance implied the condition

$$q^\mu T_\mu = 0 \tag{11.21}$$

where $q$ is the 4-momentum of a photon entering a one-photon amplitude $T_\mu$. Our discussion of (11.21) was limited in section 8.6.2 to the case of a real external photon, whereas the photon lines in $i\Pi^{[2]}_{\mu\nu}$ are internal and virtual; nevertheless it is still true that gauge invariance implies (Peskin and Schroeder 1995, section 7.4)

$$q^\mu \Pi^{[2]}_{\mu\nu} = q^\nu \Pi^{[2]}_{\mu\nu} = 0. \tag{11.22}$$

Condition (11.22) is guaranteed by the tensor structure $(q_\mu q_\nu - g_{\mu\nu}q^2)$ of the second line in (11.18), provided the divergence is regularized. As previously implied, a simple cut-off $\Lambda$ suffices for this term, since it does not alter the tensor structure, and the $\Lambda$-dependence can be compensated by the ‘$Z_3 - 1$’ counter term which has the same tensor structure (cf figure 11.2(b)). But what about the first line of (11.18)? Various gauge-invariant regularizations have been used, the effect of all of which is to cause the first line of (11.18) to vanish. The most widely used, since the 1970s, is the dimensional regularization technique introduced by ’t Hooft and Veltman (1972), which involves the ‘continuation’ of the number of space-time dimensions from four to $d$ ($< 4$). As $d$ is reduced, the integrals tend to diverge less, and the divergences can be isolated via the terms which diverge as $d \to 4$. Using gauge-invariant dimensional regularization, the two terms in the first line of (11.18) are found to cancel each other exactly, leaving just the manifestly gauge invariant second line (see appendix O of volume 2).

We proceed to the next step, renormalizing the gauge-invariant part of $i\Pi^{[2]}_{\mu\nu}(q^2)$.


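11.4 The $O(e^2)$ renormalized photon self-energy

The surviving (gauge-invariant) term of $i\Pi^{[2]}_{\mu\nu}$ is

$$i\Pi^{[2]}_{\mu\nu}(q^2) = 8e^2 (q_\mu q_\nu - q^2 g_{\mu\nu}) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\, \frac{x(1 - x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2} \tag{11.23}$$

$$= i(q^2 g_{\mu\nu} - q_\mu q_\nu)\, \Pi^{[2]}_\gamma(q^2). \tag{11.24}$$

The $d^4k'$ integral in (11.23) is exactly the same as the one in (10.42), with $\Delta$ replaced by $\Delta_\gamma$. It contains a logarithmic divergence, which we regulate by a simple cut-off $\Lambda$, so that we are dealing with the gauge-invariant quantity $\Pi^{[2]}_\gamma(q^2, \Lambda^2)$. The calculation leading to (10.55) then tells us that, as $\Lambda \to \infty$,

$$\Pi^{[2]}_\gamma(q^2, \Lambda^2) = -\frac{e^2}{\pi^2} \int_0^1 dx\, x(1 - x)\left\{\ln\Lambda + (\ln 2 - 1) - \tfrac{1}{2}\ln\Delta_\gamma\right\}. \tag{11.25}$$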


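FIGURE 11.3
One-loop corrected photon propagator connected to a charged lepton vertex.

The analogue of (10.11) is then (in the gauge $\xi = 1$)

$$\begin{aligned}
& \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\,\Pi^{[2]}_\gamma(q^2, \Lambda^2)\, \frac{-ig_{\sigma\nu}}{q^2} \\
& \quad + \frac{-ig_{\mu\rho}}{q^2}\, i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\,\Pi^{[2]}_\gamma(q^2, \Lambda^2)\, \frac{-ig_{\sigma\tau}}{q^2}\, i(q^2 g^{\tau\eta} - q^\tau q^\eta)\,\Pi^{[2]}_\gamma(q^2, \Lambda^2)\, \frac{-ig_{\eta\nu}}{q^2} + \cdots \\
& = \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho_\nu\, \Pi^{[2]}_\gamma(q^2, \Lambda^2) + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho_\tau P^\tau_\nu\, (\Pi^{[2]}_\gamma(q^2, \Lambda^2))^2 + \cdots
\end{aligned} \tag{11.26}$$

where

$$P^\rho_\nu = g^\rho_\nu - \frac{q^\rho q_\nu}{q^2}$$

and $g^\rho_\nu = \delta^\rho_\nu$ (i.e. the $4 \times 4$ unit matrix). It is easy to check (problem 10.5) that $P^\rho_\tau P^\tau_\nu = P^\rho_\nu$. Hence the series (11.26) becomes

$$\begin{aligned}
& \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho_\nu \left[\Pi^{[2]}_\gamma(q^2, \Lambda^2) + (\Pi^{[2]}_\gamma(q^2, \Lambda^2))^2 + \cdots\right] \\
& = \frac{-i}{q^2}\left(g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)\left[1 + \Pi^{[2]}_\gamma(q^2, \Lambda^2) + (\Pi^{[2]}_\gamma(q^2, \Lambda^2))^2 + \cdots\right] + \frac{-i}{q^2}\, \frac{q_\mu q_\nu}{q^2} \\
& = \frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2(1 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))} + \frac{-i}{q^2}\, \frac{q_\mu q_\nu}{q^2}
\end{aligned} \tag{11.27}$$

after summing the geometric series, exactly as in (10.11)–(10.14).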
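The two facts used in summing the series, namely that $P^\rho_\nu$ is a projector transverse to $q$ and that $1 + \Pi + \Pi^2 + \cdots = 1/(1 - \Pi)$ for $|\Pi| < 1$, can be illustrated with a short numerical sketch. The test momentum and the value of $\Pi$ below are arbitrary assumptions chosen only for the check.

```python
# Numerical illustration of P P = P, q_rho P^rho_nu = 0, and the geometric sum.
# Assumptions: metric diag(+,-,-,-), arbitrary off-shell test momentum q.

METRIC = (1.0, -1.0, -1.0, -1.0)


def projector(q):
    """Mixed-index matrix P^rho_nu = delta^rho_nu - q^rho q_nu / q^2."""
    q_low = [g * x for g, x in zip(METRIC, q)]
    q2 = sum(x * y for x, y in zip(q, q_low))
    return [[(1.0 if r == n else 0.0) - q[r] * q_low[n] / q2 for n in range(4)]
            for r in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices (upper index = row, lower index = column)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def geometric_partial_sum(pi_value, n_terms):
    """1 + Pi + Pi^2 + ... truncated after n_terms terms."""
    return sum(pi_value ** n for n in range(n_terms))


q_test = [0.3, 1.1, -0.4, 0.8]  # arbitrary momentum with q^2 != 0
P = projector(q_test)
PP = matmul(P, P)
idempotency_error = max(abs(PP[i][j] - P[i][j]) for i in range(4) for j in range(4))

q_low_test = [g * x for g, x in zip(METRIC, q_test)]
transversality_error = max(abs(sum(q_low_test[r] * P[r][n] for r in range(4)))
                           for n in range(4))

pi_test = 0.01  # illustrative small value of Pi
series_error = abs(geometric_partial_sum(pi_test, 20) - 1.0 / (1.0 - pi_test))

print(idempotency_error, transversality_error, series_error)
```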
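But we have forgotten the counter term of figure 11.1(b), which contributes an amplitude $-i(g^{\mu\nu}q^2 - q^\mu q^\nu)(Z_3 - 1)$. This has the effect of replacing $\Pi^{[2]}_\gamma$ in (11.27) by $\Pi^{[2]}_\gamma - (Z_3 - 1)$ and we arrive at the form

$$\frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2(Z_3 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))} + \frac{-i}{q^2}\, \frac{q_\mu q_\nu}{q^2}. \tag{11.28}$$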


Now in any S-matrix element, at least one end of this corrected propagator will connect to an external charged particle line via a vertex of the form $j^\mu(p, p')$ (cf (8.98) and (8.99) for example), as in figure 11.3. But, as we have seen in (8.100), current conservation implies

$$q_\mu j^\mu(p, p') = 0. \tag{11.29}$$

Hence the parts of (11.28) with $q_\mu q_\nu$ factors will not contribute to physical scattering amplitudes, and our $O(e^2)$ corrected photon propagator effectively takes the simple form

$$\frac{-ig_{\mu\nu}}{q^2(Z_3 - \Pi^{[2]}_\gamma(q^2, \Lambda^2))}. \tag{11.30}$$

We must now determine $Z_3$ from the condition (just as for the C propagator) that (11.30) has the form $-ig_{\mu\nu}/q^2$ as $q^2 \to 0$ (the mass-shell condition). This gives

$$Z_3^{[2]} = 1 + \Pi^{[2]}_\gamma(0, \Lambda^2) \tag{11.31}$$

the superscript on $Z_3$ indicating as usual that it is an $O(e^2)$ calculation as evidenced by the $e^2$ factor in (11.18). We note from equation (11.25) that $\Pi^{[2]}_\gamma(0, \Lambda^2)$ contains a $\ln\Lambda$ part, so that this time the field renormalization constant $Z_3$ diverges when the cut-off is removed.

FIGURE 11.4
The contribution of a massless particle to the photon self-energy.

Inserting (11.31) into (11.30) we obtain the final important expression for the $\gamma$-propagator including the one-loop renormalized self-energy (cf (10.71)):

$$\frac{-ig_{\mu\nu}}{q^2(1 - \bar\Pi^{[2]}_\gamma(q^2))} \tag{11.32}$$


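where

$$\bar\Pi^{[2]}_\gamma(q^2) = \Pi^{[2]}_\gamma(q^2, \Lambda^2) - \Pi^{[2]}_\gamma(0, \Lambda^2). \tag{11.33}$$

Equation (11.25) then leads to the result

$$\bar\Pi^{[2]}_\gamma(q^2) = -\frac{2\alpha}{\pi} \int_0^1 dx\, x(1 - x) \ln\left[\frac{m^2}{m^2 - q^2 x(1 - x)}\right] \tag{11.34}$$

which was first given by Schwinger (1949a). This ‘once-subtracted’ $\bar\Pi^{[2]}_\gamma$ is finite as $\Lambda \to \infty$, and tends to zero as $q^2 \to 0$.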
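The properties just quoted for (11.34) can be seen in a direct numerical evaluation. The following sketch is only illustrative: it assumes the numerical value $\alpha \approx 1/137.036$, works in units where the fermion mass is 1, and is restricted to $q^2 < 4m^2$ so that the logarithm is real; `pi_bar` and `simpson` are hypothetical helper names.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (assumed numerical input)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pi_bar(q2, m=1.0, n=2000):
    """One-loop subtracted self-energy of (11.34); valid for q2 < 4 m^2."""
    def integrand(x):
        w = x * (1.0 - x)
        return w * math.log(m * m / (m * m - q2 * w))
    return -(2.0 * ALPHA / math.pi) * simpson(integrand, 0.0, 1.0, n)


# Space-like values q^2 < 0 in units of m^2: the result is positive and grows with |q^2|.
for q2 in (0.0, -1.0e-4, -1.0, -10.0):
    print(q2, pi_bar(q2))
```

In this convention the function vanishes at $q^2 = 0$, is linear in $q^2$ for $|q^2| \ll m^2$, and is positive for space-like $q^2$.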
The generalization of (11.32) to all orders will be given by

$$\frac{-ig_{\mu\nu}}{q^2(1 - \bar\Pi_\gamma(q^2))} \tag{11.35}$$

where $\bar\Pi_\gamma(q^2)$ is the all-orders analogue of $\bar\Pi^{[2]}_\gamma$ in (11.32), and is similarly related to the 1-$\gamma$ irreducible photon self-energy $\Pi_{\mu\nu}$ via the analogue of (11.24):

$$i\Pi_{\mu\nu}(q^2) = i(q^2 g_{\mu\nu} - q_\mu q_\nu)\, \Pi_\gamma(q^2). \tag{11.36}$$

Because $\Pi_{\mu\nu}$, and hence $\Pi_\gamma$, has no 1-$\gamma$ intermediate states, it is expected to have no contribution of the form $A^2/q^2$. If such a contribution were present, (11.35) shows that it would result in a photon propagator having the form

$$\frac{-ig_{\mu\nu}}{q^2 - A^2} \tag{11.37}$$

which is, of course, that of a massive particle. Thus, provided no such contribution is present, the photon mass will remain zero through all radiative corrections. It is important to note, though, that gauge invariance is fully satisfied by the form (11.36) relating $\Pi_{\mu\nu}$ to $\Pi_\gamma$: it does not prevent the occurrence of such an ‘$A^2/q^2$’ piece in $\Pi_\gamma$. Remarkably, therefore, it seems possible, after all, to have a massive photon while respecting gauge invariance! This loophole in the argument ‘gauge invariance implies $m_\gamma = 0$’ was first pointed out by Schwinger (1962).

Such a $1/q^2$ contribution in $\Pi_\gamma$ must, of course, correspond to a massless single particle intermediate state, via a diagram of the form shown in figure 11.4. Thus if the theory contains a massless particle, not the photon (since 1-$\gamma$ states are omitted from $\Pi_{\mu\nu}$) but coupling to it, the photon can acquire mass. This is one way of understanding the ‘Higgs mechanism’ for generating a mass for a gauge-field quantum while still respecting the gauge symmetry (Englert and Brout (1964), Higgs (1964), Guralnik et al. (1964)). The massless particle involved is called a ‘Goldstone boson’. As we shall see in volume 2, just such a photon mass is generated in a superconductor, and a similar mechanism is invoked in the SM to give masses to the $W^\pm$ and $Z^0$ gauge bosons, which mediate the weak interactions.


11.5 The physics of $\bar\Pi^{[2]}_\gamma(q^2)$


We now consider some immediate physical consequences of the formulae (11.32) and (11.34). 


11.5.1 Modified Coulomb’s law 


In section 1.3.3 we saw how, in the static limit, a propagator of the form $-g_N^2(\boldsymbol{q}^2 + m_U^2)^{-1}$ could be interpreted (via a Fourier transform) in terms of a Yukawa potential

$$-\frac{g_N^2}{4\pi}\, \frac{e^{-r/a}}{r}$$

where $a = m_U^{-1}$ (in units $\hbar = c = 1$). As $m_U \to 0$ we arrive at the Coulomb potential, associated with the propagator $\sim 1/\boldsymbol{q}^2$ in the static ($q_0 = 0$) limit. It follows that the corrected propagator (11.32) must represent a correction to the $1/r$ Coulomb potential. To see what it is, we expand the denominator of (11.32) so as to write (11.32) as

$$\frac{-ig_{\mu\nu}}{q^2}\left(1 + \bar\Pi^{[2]}_\gamma(q^2)\right) \tag{11.38}$$

which is in fact the perturbative $O(\alpha)$ correction to the propagator (we shall return to (11.32) in a moment). At low energies, and in the static limit, $q^2 = -\boldsymbol{q}^2$ will be small compared to the fermion (mass)$^2$ in (11.34), and we may expand the logarithm in powers of $\boldsymbol{q}^2/m^2$, with the result that the static propagator becomes (problem 11.6)

$$\frac{ig_{\mu\nu}}{\boldsymbol{q}^2}\left(1 + \frac{\alpha}{15\pi}\, \frac{\boldsymbol{q}^2}{m^2}\right) \tag{11.39}$$

$$= ig_{\mu\nu}\left(\frac{1}{\boldsymbol{q}^2} + \frac{\alpha}{15\pi}\, \frac{1}{m^2}\right). \tag{11.40}$$
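The step from (11.34) to (11.39) can be sketched as follows. This is essentially the content of problem 11.6, written out under the assumption $|q^2| \ll m^2$ and keeping only the first term in the expansion of the logarithm:

```latex
\begin{aligned}
\ln\left[\frac{m^2}{m^2 - q^2 x(1-x)}\right] &\approx \frac{q^2}{m^2}\,x(1-x),
\qquad \int_0^1 x^2(1-x)^2\,dx = \frac{1}{30}, \\
\bar\Pi^{[2]}_\gamma(q^2) &\approx -\frac{2\alpha}{\pi}\,\frac{q^2}{m^2}\cdot\frac{1}{30}
  = -\frac{\alpha}{15\pi}\,\frac{q^2}{m^2}
  = +\frac{\alpha}{15\pi}\,\frac{\boldsymbol{q}^2}{m^2}
  \qquad (q^2 = -\boldsymbol{q}^2).
\end{aligned}
```

Inserting this into (11.38), with $-ig_{\mu\nu}/q^2 = ig_{\mu\nu}/\boldsymbol{q}^2$ in the static limit, reproduces the coefficient $\alpha/15\pi$.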


The Fourier transform of the first term in (11.40) is proportional to the familiar coulombic $1/r$ potential (see appendix G, for example), while the Fourier transform of the constant ($\boldsymbol{q}^2$-independent) second term is a $\delta$-function:

$$\int e^{i\boldsymbol{q}\cdot\boldsymbol{r}}\, \frac{d^3\boldsymbol{q}}{(2\pi)^3} = \delta^3(\boldsymbol{r}). \tag{11.41}$$

When (11.40) is used in any scattering process between two charged particles, each charged particle vertex will carry a charge $e$ (or $-e$) and so the total effective potential will be (in the attractive case)

$$-\left\{\frac{\alpha}{r} + \frac{4\alpha^2}{15m^2}\, \delta^3(\boldsymbol{r})\right\}. \tag{11.42}$$

The second term in (11.42) may be treated as a perturbation in hydrogenic atoms, taking $m$ now to be the electron mass $m_e$. Application of first-order perturbation theory yields an energy shift

$$\Delta E_n^{(1)} = -\frac{4\alpha^2}{15m_e^2} \int \psi_n^*(\boldsymbol{r})\, \delta^3(\boldsymbol{r})\, \psi_n(\boldsymbol{r})\, d^3\boldsymbol{r} = -\frac{4\alpha^2}{15m_e^2}\, |\psi_n(0)|^2. \tag{11.43}$$

Only s-state wavefunctions are non-vanishing at the origin, where they take the value (in hydrogen)

$$\psi_n(0) = \frac{1}{\sqrt{\pi}}\left(\frac{\alpha m_e}{n}\right)^{3/2} \tag{11.44}$$

where $n$ is the principal quantum number. Hence for this case

$$\Delta E_n^{(1)} = -\frac{4\alpha^5 m_e}{15\pi n^3}. \tag{11.45}$$

For example, in the 2s state the energy shift is $-1.122 \times 10^{-7}$ eV. Although we did not discuss the Coulomb spectrum predicted by the Dirac equation in chapter 3, it turns out that the $2^2S_{1/2}$ and $2^2P_{1/2}$ levels are degenerate if no radiative corrections (such as the previous one) are applied. In fact, the levels are found experimentally to be split apart by the famous ‘Lamb shift’, which amounts to $\Delta E/2\pi\hbar = 1058$ MHz in frequency units. The shift we have calculated, for the 2s level, is $-27.13$ MHz in these units, so it is a small (but still perfectly measurable) contribution to the entire shift. This particular contribution was first calculated by Uehling (1935).
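The two numbers quoted for the 2s level follow directly from (11.45). The sketch below simply evaluates that formula; the numerical values of $\alpha$, the electron mass and Planck's constant are assumed inputs (approximate CODATA values), and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.035999        # fine structure constant (assumed input)
M_E_EV = 510998.95              # electron mass in eV (assumed input)
H_EV_S = 4.135667696e-15        # Planck constant h in eV s (assumed input)


def uehling_shift_ev(n, mass_ev=M_E_EV):
    """Energy shift of (11.45), -4 alpha^5 m / (15 pi n^3), in eV."""
    return -4.0 * ALPHA ** 5 * mass_ev / (15.0 * math.pi * n ** 3)


def shift_in_mhz(energy_ev):
    """Convert an energy in eV to a frequency in MHz using Delta E / (2 pi hbar) = Delta E / h."""
    return energy_ev / H_EV_S / 1.0e6


shift_2s_ev = uehling_shift_ev(2)
shift_2s_mhz = shift_in_mhz(shift_2s_ev)
print(shift_2s_ev, "eV")
print(shift_2s_mhz, "MHz")
```

The $1/n^3$ dependence comes entirely from $|\psi_n(0)|^2$ in (11.44), so the same function gives the shift for any s-state.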

While small in hydrogen and ordinary atoms, the ‘Uehling effect’ dominates the radiative corrections in muonic atoms, where the ‘$m$’ in (11.44) becomes the muon mass $m_\mu$. This means that the result (11.45) becomes

$$-\frac{4\alpha^5}{15\pi n^3}\left(\frac{m_\mu}{m_e}\right)^2 m_\mu.$$

Since the unperturbed energy levels are (in this case) proportional to $m_\mu$, this represents a relative enhancement of $\sim(m_\mu/m_e)^2 \sim (210)^2$. This calculation cannot be trusted in detail, however, as the muonic atom radius is itself $\sim 1/210$ times smaller than the electron radius in hydrogen, so that the approximation $|\boldsymbol{q}| \sim 1/r \ll m_e$, which led to (11.42), is no longer accurate enough. Nevertheless the order of magnitude is correct.


11.5.2 Radiatively induced charge form factor 


This leads us to consider (11.38) more generally, without making the low $q^2$ expansion. In chapter 8 we learned how the static Coulomb potential became modified by a form factor $F(\boldsymbol{q}^2)$ if the scattering centre was not point-like, and we also saw how the idea could be extended to covariant form factors for spin-0 and spin-$\tfrac{1}{2}$ particles. Referring to the case of $e^-\mu^-$ scattering for definiteness (section 8.7), we may consider the effect of inserting (11.38) into (8.182). The result is

$$e^2\, \bar u_{k'}\gamma_\mu u_k\, \frac{g^{\mu\nu}}{q^2}\left\{1 + \bar\Pi^{[2]}_\gamma(q^2)\right\} \bar u_{p'}\gamma_\nu u_p. \tag{11.46}$$

Referring now to the discussion of form factors for charged spin-$\tfrac{1}{2}$ particles in section 8.8, we can share the correction (11.46) equally between the $e^-$ and the $\mu^-$ vertices and write

$$e\bar u_{k'}\gamma_\mu u_k \to e\bar u_{k'}\gamma_\mu u_k\, \left(1 + \bar\Pi^{[2]}_\gamma(q^2)\right)^{1/2} \approx e\bar u_{k'}\gamma_\mu u_k\, \left(1 + \tfrac{1}{2}\bar\Pi^{[2]}_\gamma(q^2)\right) \tag{11.47}$$

for the electron, and similarly for the muon. From (8.208) this means that our ‘radiative correction’ has generated some effective extension of the charge, as given by a charge form factor $F_1(q^2) = 1 + \tfrac{1}{2}\bar\Pi^{[2]}_\gamma(q^2)$. Note that the condition $F_1(0) = 1$ is satisfied since $\bar\Pi^{[2]}_\gamma(0) = 0$.

In the static case, or for scattering of equal mass particles in the CM system, we have $q^2 = -\boldsymbol{q}^2$ and we may consider the Fourier transform of the function $F_1(-\boldsymbol{q}^2)$, to obtain the charge distribution. The integral is discussed in Weinberg (1995, section 10.2) and in Peskin and Schroeder (1995, section 7.5). The latter authors show that the approximate radial distribution of charge is $\sim e^{-2mr}/(mr)^{3/2}$, indicating that it has a range $\sim 1/(2m)$. Here $2m$ is precisely the mass of the fermion–anti-fermion intermediate state in the loop which yields $\bar\Pi^{[2]}_\gamma$, so this result represents a plausible qualitative extension of Yukawa’s relationship (1.20) to the case of two-particle exchange. In any case, the range represented by $\bar\Pi^{[2]}_\gamma$ is of order of the fermion Compton wavelength $1/m$, which is an important insight; this is why we need to do better than the point-like approximation (11.42) in the case of muonic atoms.

FIGURE 11.5
Screening of charge in a dipolar medium (from Aitchison 1985).


11.5.3 The running coupling constant 


There is yet another way of interpreting (11.38). Referring to (11.46), we may regard

$$e^2(q^2) = e^2\left[1 + \bar\Pi^{[2]}_\gamma(q^2)\right] \tag{11.48}$$

as a ‘$q^2$-dependent effective charge’. In fact, it is usually written as a ‘$q^2$-dependent fine structure constant’

$$\alpha(q^2) = \alpha\left[1 + \bar\Pi^{[2]}_\gamma(q^2)\right]. \tag{11.49}$$


The concept of a $q^2$-dependent charge may be startling but the related one of a spatially dependent charge is, in fact, familiar from the theory of dielectrics. Consider a test charge $q$ in a polarizable dielectric medium, such as water. If we introduce another test charge $-q$ into the medium, the electric field between the two test charges will line up the water molecules (which have a permanent electric dipole moment) as shown in figure 11.5. There will be an induced dipole moment $\boldsymbol{P}$ per unit volume, and the effect of $\boldsymbol{P}$ on the resultant field is (from elementary electrostatics) the same as that produced by a volume charge equal to $-\mathrm{div}\,\boldsymbol{P}$. If, as is usual, $\boldsymbol{P}$ is taken to be proportional to $\boldsymbol{E}$, so that $\boldsymbol{P} = \chi\epsilon_0\boldsymbol{E}$, Gauss’ law will be modified from

$$\mathrm{div}\,\boldsymbol{E} = \rho_{\rm free}/\epsilon_0 \tag{11.50}$$

to

$$\mathrm{div}\,\boldsymbol{E} = (\rho_{\rm free} - \mathrm{div}\,\boldsymbol{P})/\epsilon_0 = \rho_{\rm free}/\epsilon_0 - \mathrm{div}(\chi\boldsymbol{E}) \tag{11.51}$$

where $\rho_{\rm free}$ refers to the test charges introduced into the dielectric. If $\chi$ is slowly varying as compared to $\boldsymbol{E}$, it may be taken as approximately constant in (11.51), which may then be written as

$$\mathrm{div}\,\boldsymbol{E} = \rho_{\rm free}/\epsilon \tag{11.52}$$

where $\epsilon = (1 + \chi)\epsilon_0$ is the dielectric constant of the medium, $\epsilon_0$ being that of the vacuum. Thus the field is effectively reduced by the factor $(1 + \chi)^{-1} = \epsilon_0/\epsilon$.

FIGURE 11.6
Effective (screened) charge versus separation between charges (from Aitchison 1985).

This is all familiar ground. Note, however, that this treatment is essentially macroscopic, the molecules being replaced by a continuous distribution of charge density $-\mathrm{div}\,\boldsymbol{P}$. When the distance between the two test charges is as small as, roughly, the molecular diameter, this reduction (or screening effect) must cease and the field between them has the full unscreened value. In general, the electrostatic potential between two test charges $q_1$ and $q_2$ in a dielectric can be represented phenomenologically by

$$V(r) = q_1 q_2/4\pi\epsilon(r)r \tag{11.53}$$

where $\epsilon(r)$ is assumed to vary slowly from the value $\epsilon$ for $r \gg d$ to the value $\epsilon_0$ for $r \ll d$, where $d$ is the diameter of the polarized molecules. The situation may be described in terms of an effective charge

$$q' = q/[\epsilon(r)]^{1/2} \tag{11.54}$$

for each of the test charges. Thus we have an effective charge which depends on the interparticle separation, as shown in figure 11.6.

Now consider the application of this idea to QED, replacing the polarizable medium by the vacuum. The important idea is that, in the vicinity of a test charge in vacuo, charged pairs can be created. Pairs of particles of mass $m$ can exist for a time of the order of $\Delta t \sim \hbar/mc^2$. They can spread apart a distance of order $c\Delta t$ in this time, i.e. a distance of approximately $\hbar/mc$, which is the Compton wavelength $\lambda_c$. This distance gives a measure of the ‘molecular diameter’ we are talking about, since it is the polarized virtual pairs which now provide a vacuum screening effect around the original charged particle. The largest ‘diameter’ will be associated with the smallest mass $m$, in this case the electron mass. Not coincidentally, this estimate of the range of the ‘spreading’ of the charge ‘cloud’ is just what we found in section 11.5.2: namely, the fermion Compton wavelength. The longest-range part of the cloud will be that associated with the lightest charged fermion, the electron.

In this analogy the bare vacuum (no virtual pairs) corresponds to the ‘vacuum’ used in the previous macroscopic analysis and the physical vacuum (virtual pairs) to the polarizable dielectric. We cannot, of course, get outside the physical vacuum, so that we are really always dealing with effective charges that depend on $r$. What, then, do we mean by the familiar symbol $e$? This is simply the effective charge as $r \to \infty$ or $q^2 \to 0$; or, in practice, the charge relevant for distances much larger than the particles’ Compton wavelength. This is how our $q^2 \to 0$ definition is to be understood.

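Let us consider, then, how $\alpha(q^2)$ varies when $q^2$ moves to large space-like values, such that $-q^2$ is much greater than $m^2$ (i.e. to distances well within the ‘cloud’). For $|q^2| \gg m^2$ we find (problem 11.7) from (11.34) that

$$\bar\Pi^{[2]}_\gamma(q^2) = \frac{\alpha}{3\pi}\left[\ln\left(\frac{|q^2|}{m^2}\right) - \frac{5}{3} + O(m^2/|q^2|)\right] \tag{11.55}$$

so that our $q^2$-dependent fine structure constant, to leading order in $\alpha$, is

$$\alpha(q^2) \approx \alpha\left[1 + \frac{\alpha}{3\pi}\ln\left(\frac{|q^2|}{Am^2}\right)\right] \tag{11.56}$$

for large values of $|q^2|/m^2$, where $A = \exp(5/3)$.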

Equation (11.56) shows that the effective strength $\alpha(q^2)$ tends to increase at large $|q^2|$ (short distances). This is, after all, physically reasonable. The reduction in the effective charge caused by the dielectric constant associated with the polarization of the vacuum disappears (the charge increases) as we pass inside some typical dipole length. In the present case, that length is $m^{-1}$ (in our standard units $\hbar = c = 1$), the fermion Compton wavelength, a typical distance over which the fluctuating pairs extend.
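The large-$|q^2|$ form (11.55) can be compared with a direct numerical evaluation of (11.34). The sketch below is an illustration only: it assumes $\alpha \approx 1/137.036$, takes space-like $q^2 = -\text{ratio} \times m^2$, and uses hypothetical helper names. The two expressions should differ only by the $O(m^2/|q^2|)$ terms.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (assumed numerical input)


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def pi_bar_exact(ratio):
    """(11.34) for space-like q^2 = -ratio * m^2, with ratio = |q^2|/m^2 > 0."""
    def integrand(x):
        w = x * (1.0 - x)
        return w * math.log(1.0 / (1.0 + ratio * w))
    return -(2.0 * ALPHA / math.pi) * simpson(integrand, 0.0, 1.0)


def pi_bar_asymptotic(ratio):
    """Leading large-|q^2| form (11.55), dropping the O(m^2/|q^2|) terms."""
    return (ALPHA / (3.0 * math.pi)) * (math.log(ratio) - 5.0 / 3.0)


ratio_test = 1.0e4  # |q^2| / m^2, chosen large for illustration
exact_value = pi_bar_exact(ratio_test)
asymptotic_value = pi_bar_asymptotic(ratio_test)
print(exact_value, asymptotic_value)
```

The constant $-5/3$ arises from $\int_0^1 x(1-x)\ln[x(1-x)]\,dx = -5/18$, which is why $A = \exp(5/3)$ appears in (11.56).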

The foregoing is the reason why this whole phenomenon is called vacuum polarization, and why the original diagram which gave $\bar\Pi^{[2]}_\gamma$ is called a vacuum polarization diagram.

Equation (11.56) is the lowest-order correction to $\alpha$, in a form valid for $|q^2| \gg m^2$. It turns out that, in this limit, the dominant vacuum polarization contributions (for a theory with one charged fermion) can be isolated in each order of perturbation theory and summed explicitly. The result of summing these ‘leading logarithms’ is

$$\alpha(Q^2) = \frac{\alpha}{[1 - (\alpha/3\pi)\ln(Q^2/Am^2)]} \quad \text{for } Q^2 \gg m^2 \tag{11.57}$$

where we now introduce $Q^2 = -q^2$, a positive quantity when $q$ is a momentum transfer. The justification for (11.57), which of course amounts to the very plausible return to (11.32) instead of (11.38), is subtle, and depends upon ideas grouped under the heading of the ‘renormalization group’. This is beyond the scope of the present volume, but will be taken up again in volume 2.

Equation (11.57) presents some interesting features. First, note that for typical large $Q^2 \sim (50\ \mathrm{GeV})^2$, say, the change in the effective $\alpha$ predicted by (11.57) is quite measurable. Let us write

$$\alpha(Q^2) = \frac{\alpha}{1 - \Delta\alpha(Q^2)} \tag{11.58}$$

in general, where $\Delta\alpha(Q^2)$ includes the contributions from all charged fermions with mass $m$ such that $m^2 \ll Q^2$. The contribution from the charged leptons is then straightforward, being given by

$$\Delta\alpha_{\rm leptons} = \frac{\alpha}{3\pi}\sum_l \ln(Q^2/Am_l^2) \tag{11.59}$$

where $m_l$ is the lepton mass. Including the $e$, $\mu$, and $\tau$ one finds (problem 11.8)

$$\Delta\alpha_{\rm leptons}(Q^2 = (50\ \mathrm{GeV})^2) \approx 0.03. \tag{11.60}$$

However, the corresponding quark loop contributions are subject to strong interaction corrections, and are not straightforward to calculate. We shall not pursue this in detail here, noting just that the total contribution from the five quarks $u$, $d$, $s$, $c$, and $b$ has a value very similar to (11.60) for the leptons (see, for example, Altarelli et al. 1989). Including both the leptonic and hadronic contributions then yields the estimate

$$\alpha(Q^2 = (50\ \mathrm{GeV})^2) \approx \frac{1}{137} \times \frac{1}{1 - 0.06} \approx \frac{1}{129}. \tag{11.61}$$
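The estimates (11.60) and (11.61) can be reproduced with a few lines of arithmetic. In the sketch below the lepton masses and the value of $\alpha$ are assumed numerical inputs (approximate PDG values), and the hadronic contribution is not calculated: following the statement above, it is simply taken to be 0.03, the same as the rounded leptonic value.

```python
import math

ALPHA = 1.0 / 137.036                 # low-energy fine structure constant (assumed input)
A_CONST = math.exp(5.0 / 3.0)         # the constant A of (11.56)
LEPTON_MASSES_GEV = {                 # approximate lepton masses in GeV (assumed inputs)
    "e": 0.000510999,
    "mu": 0.1056584,
    "tau": 1.77686,
}


def delta_alpha_leptons(q_gev):
    """Leptonic contribution (11.59) at Q = q_gev (in GeV)."""
    q2 = q_gev ** 2
    log_sum = sum(math.log(q2 / (A_CONST * m * m)) for m in LEPTON_MASSES_GEV.values())
    return ALPHA / (3.0 * math.pi) * log_sum


def alpha_effective(delta_alpha_total):
    """Effective coupling (11.58) for a given total Delta alpha."""
    return ALPHA / (1.0 - delta_alpha_total)


DELTA_ALPHA_HADRONS = 0.03  # not computed here: value taken from the text's statement

leptonic = delta_alpha_leptons(50.0)
inverse_alpha_50 = 1.0 / alpha_effective(leptonic + DELTA_ALPHA_HADRONS)
print("Delta alpha (leptons):", leptonic)
print("1/alpha at Q = 50 GeV:", inverse_alpha_50)
```

The electron dominates the sum because the logarithm is largest for the smallest mass, in line with the remark that the lightest charged fermion gives the longest-range part of the cloud.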


The predicted increase of $\alpha(Q^2)$ at large $Q^2$ has been tested by measuring the differential cross section for Bhabha scattering,

$$e^+e^- \to e^+e^-. \tag{11.62}$$

We are interested in the contribution from one-photon exchange in the $t$-channel, which will contain the factor $\alpha(Q^2)$. To favour this contribution, the CM energy should be well beyond the $Z^0$ peak in the $s$-channel (cf figure 9.16). This was the case at the highest LEP energy, $\sqrt{s} = 198$ GeV, which also allowed large $Q^2$ values to be probed. The L3 experiment covered the region $1800\ \mathrm{GeV}^2 < Q^2 < 21600\ \mathrm{GeV}^2$ (Achard et al. 2005). These results, and earlier data from L3 (Acciari et al. 2000) and OPAL (Abbiendi et al. 2000), clearly show the expected rise in $\alpha(Q^2)$ as $Q^2$ increases, and are in good quantitative agreement with the theoretical prediction of QED (Burkhardt and Pietrzyk (2001)).

The notion of a $q^2$-dependent coupling constant is, in fact, quite general: for example, we could just as well interpret (10.71) in terms of a $q^2$-dependent $g^2_{\rm ph}(q^2)$. Such ‘varying constants’ are called running coupling constants. Until 1973 it was generally believed that they would all behave in essentially the same way as (11.57), namely, a logarithmic rise as $Q^2$ increases. Many people (in particular Landau 1955) noted that if equation (11.57) is taken at face value for arbitrarily large $Q^2$, then $\alpha(Q^2)$ itself will diverge at $Q^2 = Am^2\exp(3\pi/\alpha)$. Taking $m$ to be the mass of an electron, this is of course an absurdly high energy. Besides, as such energies are reached, approximations made in arriving at (11.57) will break down; all we can really say is that perturbation theory will fail as we approach such energies.
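To see just how large this energy is, one can evaluate $Q^2 = Am^2\exp(3\pi/\alpha)$ with $m$ the electron mass. The sketch below assumes $\alpha \approx 1/137.036$ and $m_e \approx 0.511$ MeV as inputs, and works with logarithms because the number itself is far beyond the range of double-precision floats; the function name is a hypothetical helper.

```python
import math

ALPHA = 1.0 / 137.036      # fine structure constant (assumed input)
M_E_GEV = 0.000510999      # electron mass in GeV (assumed input)


def log10_divergence_scale_gev(alpha=ALPHA, m_gev=M_E_GEV):
    """log10 of Q (in GeV) at which the denominator of (11.57) vanishes.

    Uses ln Q^2 = 5/3 + 2 ln m + 3 pi / alpha, i.e. Q^2 = A m^2 exp(3 pi / alpha).
    """
    ln_q2 = 5.0 / 3.0 + 2.0 * math.log(m_gev) + 3.0 * math.pi / alpha
    return 0.5 * ln_q2 / math.log(10.0)


exponent = log10_divergence_scale_gev()
print("Q is roughly 10 to the power", exponent, "GeV")
```

The exponent comes out at several hundred, which makes concrete the sense in which the divergence is an academic point within QED alone.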

While this may be an academic point in QED, it turns out that there is one part of the 
SM where it appears to be highly relevant. This is the ‘Higgs sector’ involving a complex 
scalar field, as will be discussed in volume 2. 

The significance of the 1973 date is that it was in that year that one of the most important discoveries in ‘post-QED’ quantum field theory was made, by Politzer (1973) and by Gross and Wilczek (1973). They performed a similar one-loop calculation in the more complicated case of QCD, which is a ‘non-Abelian gauge theory’ (as is the theory of the weak interactions in the electroweak theory). They found that the QCD analogue of (11.57) was

$$\alpha_s(Q^2) = \frac{\alpha_s(\mu^2)}{\left[1 + \dfrac{\alpha_s(\mu^2)}{12\pi}(33 - 2f)\ln(Q^2/\mu^2)\right]} \tag{11.63}$$

where $f$ is the number of fermion–anti-fermion loops considered, and $\mu$ is a reference mass scale. The crucial difference from (11.57) is the large positive contribution ‘+33’, which is related to the contributions from the gluonic self-interactions (non-existent among photons). The quantity $\alpha_s(Q^2)$ now tends to decrease at large $Q^2$ (provided $f \le 16$), tending ultimately to zero. This property is called ‘asymptotic freedom’ and is highly relevant to understanding the success of the parton model of chapter 9, in which the quarks and gluons are taken to be essentially free at large values of $Q^2$. This can be qualitatively understood in terms of $\alpha_s(Q^2) \to 0$ for high momentum transfers (‘deep scattering’). The non-Abelian parts of the SM will be considered in volume 2, where we shall return again to $\alpha_s(Q^2)$.

FIGURE 11.7
Vacuum polarization insertion in the virtual one-photon annihilation amplitude in $e^+e^- \to \mu^+\mu^-$.
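The contrast between (11.57) and (11.63) is easy to see numerically. The sketch below evaluates the one-loop formula (11.63); the reference values $\alpha_s(\mu^2) = 0.118$ at $\mu = 91.19$ GeV and $f = 5$ are illustrative assumptions, not numbers taken from the text, and the function name is hypothetical.

```python
import math


def alpha_s_one_loop(q2, alpha_s_ref, mu2, n_flavours):
    """One-loop running coupling of (11.63).

    q2 and mu2 are in the same units (e.g. GeV^2); alpha_s_ref is alpha_s(mu2).
    """
    coefficient = (33.0 - 2.0 * n_flavours) / (12.0 * math.pi)
    return alpha_s_ref / (1.0 + alpha_s_ref * coefficient * math.log(q2 / mu2))


ALPHA_S_REF = 0.118          # assumed reference value of alpha_s(mu^2)
MU2 = 91.19 ** 2             # assumed reference scale mu^2 in GeV^2

# With f = 5 the coupling decreases as Q^2 increases (asymptotic freedom).
for q_gev in (10.0, 91.19, 1000.0):
    print(q_gev, alpha_s_one_loop(q_gev ** 2, ALPHA_S_REF, MU2, 5))

# With f = 17 the sign of (33 - 2f) flips and the coupling would rise instead.
print(alpha_s_one_loop(1000.0 ** 2, ALPHA_S_REF, MU2, 17))
```

Since $33 - 2f$ changes sign between $f = 16$ and $f = 17$, the last line shows the QED-like behaviour that would result from too many fermion flavours.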


11.5.4 $\bar\Pi^{[2]}_\gamma$ in the s-channel

We have still not exhausted the riches of $\bar\Pi^{[2]}_\gamma(q^2)$. Hitherto we have concentrated on regarding our corrected propagator as appearing in a $t$-channel exchange process, where $q^2 < 0$. But of course it could also perfectly well enter an $s$-channel process such as $e^+e^- \to \mu^+\mu^-$ (see problem 8.18), as in figure 11.7. In this case, the 4-momentum carried by the photon is $q = p_{e^+} + p_{e^-} = p_{\mu^+} + p_{\mu^-}$ so that $q^2$ is precisely the usual invariant variable ‘$s$’ (cf section 6.3.3), which in turn is the square of the CM energy and is therefore positive. In fact, the process of figure 11.7 occurs physically only for $q^2 = s > 4m_\mu^2$, where $m_\mu$ is the muon mass.

Consider, therefore, our formula (11.34) for $q^2 > 0$, that is, in the time-like rather than
the space-like ($q^2 < 0$) region, and with m now equal to $m_e$, since the loop fermion is the
electron. The crucial new point is that the argument $[m_e^2 - q^2 x(1-x)]$ of the logarithm
can now become negative, so that $\bar\Pi^{[2]}_\gamma$ must develop an imaginary part. The smallest $q^2$ for
which this can happen will correspond to the largest possible value of the product $x(1-x)$,
for $0 < x < 1$. This value is $\tfrac{1}{4}$, and so $\bar\Pi^{[2]}_\gamma$ acquires an imaginary part for $q^2 > 4m_e^2$, which is the
threshold for real creation of an $e^+e^-$ pair.

This is the first time that we have encountered an imaginary part in a Feynman amplitude
which, for figure 11.7 and omitting all the spinor factors, is once again

$$\frac{e^2}{q^2\,[1 - \bar\Pi^{[2]}_\gamma(q^2)]} \tag{11.64}$$

but now $q^2 > 4m_\mu^2$, which is greater than $4m_e^2$, so that $\bar\Pi^{[2]}_\gamma(q^2)$ in (11.64) has an imaginary
part. There is a good physical reason for this, which has to do with unitarity. This was
introduced in section 6.2.2 in terms of the relation $SS^\dagger = I$ for the S-matrix. The invariant
amplitude M is related to S by $S_{fi} = 1 + i(2\pi)^4\delta^4(p_i - p_f)M_{fi}$ (cf (6.102)). Inserting this
into $SS^\dagger = I$ leads to an equation of the form (for help see Peskin and Schroeder (1995,
section 7.3))


$$2\,\mathrm{Im}\,M_{fi} = \sum_k M^{*}_{kf} M_{ki}\,(2\pi)^4\,\delta^4\Big(p_i - \sum_j q_j\Big) \tag{11.65}$$


where ‘$\sum_k$’ stands for the phase space integral involving momenta $q_1, q_2, \ldots$ over the states
allowed by energy-momentum conservation. This implies that as the energy crosses each
threshold for production of a newly allowed state, there will be a new contribution to the
imaginary part of M. This is exactly what we are seeing here, at the $e^+e^-$ threshold.
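The threshold argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: it scans $x$ on a uniform grid and asks whether the argument $m^2 - q^2x(1-x)$ of the logarithm ever becomes negative. The function names and the grid size are our own illustrative choices, and the electron mass value is the standard one in GeV.

```python
# Sketch: locate the onset of an imaginary part in the vacuum polarization.
# Assumption: the logarithm's argument is m^2 - q^2 x (1 - x), as stated in the text.

M_E = 0.51099895e-3  # electron mass in GeV (standard value)


def log_argument_min(q2, m=M_E, n=10001):
    """Minimum over 0 < x < 1 of m^2 - q^2 x (1 - x), on a uniform grid."""
    xs = (i / (n - 1) for i in range(1, n - 1))  # grid contains x = 1/2 exactly
    return min(m * m - q2 * x * (1.0 - x) for x in xs)


def has_imaginary_part(q2, m=M_E):
    """True if the logarithm's argument goes negative somewhere in 0 < x < 1."""
    return log_argument_min(q2, m) < 0.0


threshold = 4.0 * M_E ** 2  # q^2 = 4 m_e^2, the e+ e- pair-creation threshold

for label, q2 in [("space-like", -1.0),
                  ("just below threshold", 0.999 * threshold),
                  ("just above threshold", 1.001 * threshold)]:
    print(f"{label:>22}: imaginary part present = {has_imaginary_part(q2)}")
```

The scan confirms that the sign change first occurs at $x = \tfrac12$, i.e. at $q^2 = 4m_e^2$, and never for space-like $q^2$.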

It is interesting, incidentally, that (11.65) can be used to derive the relativistic generalization
of the optical theorem given in appendix H (note that the right-hand side of (11.65)
is clearly related to the total cross section for $i \to k$, if $i = f$).
FIGURE 11.8
One-loop vertex correction for a charged lepton $\ell^-$.


As regards the real part of $\bar\Pi^{[2]}_\gamma(q^2)$ in the time-like region, it will be given by (11.57)
with $Q^2$ replaced by $q^2$, or s, for large values of $q^2$. Again, measurements have verified the
predicted variation of $\alpha(q^2)$ in the time-like region (Miyabayashi et al. 1995, Ackerstaff et
al. 1998, Abbiendi et al. 1999, 2000).

There is one more ‘elementary’ loop that we must analyse: the vertex correction shown
in figure 11.8, which we now discuss. We will see how the important relation $Z_1 = Z_2$
emerges, and introduce some of the physics contained in the renormalized vertex.


11.6 The $O(e^2)$ vertex correction, and $Z_1 = Z_2$


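The amplitude corresponding to figure 11.8 is

$$-ie\,\bar u(p')\,\Gamma^{[2]}_\mu(p,p')\,u(p) = \bar u(p')\int(-ie\gamma^\nu)\,\frac{-i}{k^2}\,\frac{i}{\slashed{p}' - \slashed{k} - m}\,(-ie\gamma_\mu)\,\frac{i}{\slashed{p} - \slashed{k} - m}\,(-ie\gamma_\nu)\,\frac{d^4k}{(2\pi)^4}\,u(p) \tag{11.66}$$

where $\gamma_\nu = g_{\nu\sigma}\gamma^\sigma$, m is the lepton mass, and $\Gamma^{[2]}_\mu$ represents the correction to the standard
vertex and again $\xi = 1$. We find

$$\Gamma^{[2]}_\mu(p,p') = -ie^2\int\frac{1}{k^2}\,\gamma^\nu\,\frac{1}{\slashed{p}' - \slashed{k} - m}\,\gamma_\mu\,\frac{1}{\slashed{p} - \slashed{k} - m}\,\gamma_\nu\,\frac{d^4k}{(2\pi)^4}. \tag{11.67}$$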


The integral is logarithmically divergent at large k, by power counting, and the divergence
will be cancelled by the $Z_1$ counter term of figure 11.1(c). It turns out to be infrared
divergent also, as was $d\Sigma^{[2]}/d\slashed{p}$. As in the latter case, we leave the infrared problem aside,
concentrating on the removal of ultraviolet divergences.

$Z_1$ is determined by the requirement that the total amplitude at $q = p - p' = 0$, for
on-shell fermions, is just $-ie\,\bar u(p)\gamma_\mu u(p)$, this being our definition of ‘e’. Hence we have (at
$O(e^2)$)

$$-ie\,\bar u(p)\,\Gamma^{[2]}_\mu(p,p)\,u(p) - ie\,\bar u(p)\,\gamma_\mu(Z_1^{[2]} - 1)\,u(p) = 0 \tag{11.68}$$

and so

$$\Gamma^{[2]}_\mu(p,p) + \gamma_\mu(Z_1^{[2]} - 1) = 0. \tag{11.69}$$


The renormalized vertex correction $\bar\Gamma^{[2]}_\mu$ may then be defined as

$$\bar\Gamma^{[2]}_\mu(p,p') = \Gamma^{[2]}_\mu(p,p') + (Z_1^{[2]} - 1)\gamma_\mu = \Gamma^{[2]}_\mu(p,p') - \Gamma^{[2]}_\mu(p,p) \tag{11.70}$$

and in this ‘once-subtracted’ form it is finite, and equal to zero at q = 0.
We shall consider some physical consequences of $\bar\Gamma^{[2]}_\mu$ in a moment, but first we show that
(at $O(e^2)$) $Z_1^{[2]} = Z_2^{[2]}$, and explain the significance of this important relation. It is, after
all, at first sight a rather surprising equality between two apparently unrelated quantities,
one associated with the fermion self-energy, the other with the vertex part. From (11.9) we
have, for the fermion self-energy,

$$\Sigma^{[2]}(p) = -ie^2\int\frac{1}{k^2}\,\gamma^\nu\,\frac{1}{\slashed{p} - \slashed{k} - m}\,\gamma_\nu\,\frac{d^4k}{(2\pi)^4}. \tag{11.71}$$


One can discern some kind of similarity between (11.71) and (11.67), which can be elucidated 
with the help of a little algebra. 
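Consider differentiating the identity $(\slashed{p} - m)(\slashed{p} - m)^{-1} = 1$ with respect to $p^\mu$:

$$\begin{aligned}
0 &= \frac{\partial}{\partial p^\mu}\left[(\slashed{p} - m)(\slashed{p} - m)^{-1}\right] \\
  &= \left[\frac{\partial}{\partial p^\mu}(\slashed{p} - m)\right](\slashed{p} - m)^{-1} + (\slashed{p} - m)\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1} \\
  &= \gamma_\mu(\slashed{p} - m)^{-1} + (\slashed{p} - m)\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1}.
\end{aligned} \tag{11.72}$$

It follows that

$$\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1} = -(\slashed{p} - m)^{-1}\gamma_\mu(\slashed{p} - m)^{-1} \tag{11.73}$$

from which the Ward identity (Ward 1950) follows immediately:

$$-\frac{\partial\Sigma^{[2]}}{\partial p^\mu} = \Gamma^{[2]}_\mu(p, p' = p). \tag{11.74}$$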


Derived here to one-loop order, the identity is, in fact, true to all orders, provided that
a gauge-invariant regularization is adopted. Note that the identity deals with $\Gamma^{[2]}_\mu$ at zero
momentum transfer ($q = p - p' = 0$), which is the value at which e is defined. Note also
that, consistently with (11.74), $\partial\Sigma^{[2]}/\partial p^\mu$ and $\Gamma^{[2]}_\mu$ are each both infrared and ultraviolet
divergent, though we shall only be concerned with the latter.
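The algebraic step (11.73), on which the Ward identity rests, can be verified numerically. The sketch below is our own illustration rather than part of the text's derivation: it assumes the Dirac representation of the gamma matrices, uses the closed form $(\slashed{p} - m)^{-1} = (\slashed{p} + m)/(p^2 - m^2)$ for off-shell $p$, and compares a central finite difference with the right-hand side of (11.73). All helper names are hypothetical.

```python
# Sketch: finite-difference check of
#   d/dp^mu (pslash - m)^(-1) = -(pslash - m)^(-1) gamma_mu (pslash - m)^(-1).
# Assumptions: Dirac representation, metric (+,-,-,-), off-shell momentum.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def neg2(s):
    return [[-s[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO2, ONE2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^mu (upper index) and gamma_mu = g_{mu nu} gamma^nu (lower index)
GAMMA_UP = [block(ONE2, ZERO2, ZERO2, neg2(ONE2))] + \
           [block(ZERO2, s, neg2(s), ZERO2) for s in SIGMA]
METRIC = [1, -1, -1, -1]
GAMMA_DOWN = [scale(METRIC[mu], GAMMA_UP[mu]) for mu in range(4)]

def slash(p):
    """pslash = gamma_mu p^mu for contravariant components p^mu."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = add(out, scale(p[mu], GAMMA_DOWN[mu]))
    return out

def propagator(p, m):
    """(pslash - m)^(-1) = (pslash + m) / (p^2 - m^2)."""
    p2 = sum(METRIC[mu] * p[mu] ** 2 for mu in range(4))
    return scale(1.0 / (p2 - m * m), add(slash(p), scale(m, IDENTITY)))

def ward_residual(p, m, mu, h=1e-5):
    """Largest matrix element of (finite-difference derivative) - (RHS of 11.73)."""
    plus, minus = list(p), list(p)
    plus[mu] += h
    minus[mu] -= h
    deriv = scale(1.0 / (2 * h),
                  add(propagator(plus, m), scale(-1, propagator(minus, m))))
    s = propagator(p, m)
    rhs = scale(-1, matmul(matmul(s, GAMMA_DOWN[mu]), s))
    return max(abs(deriv[i][j] - rhs[i][j]) for i in range(4) for j in range(4))

p_test, m_test = [1.3, 0.2, -0.4, 0.7], 0.5  # arbitrary off-shell point
residuals = [ward_residual(p_test, m_test, mu) for mu in range(4)]
print(residuals)
```

The residuals are limited only by the finite-difference step, which is what one expects if (11.73) holds as a matrix identity for each component $\mu$.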

The quantities $\Sigma^{[2]}$ and $\Gamma^{[2]}_\mu$ are both $O(e^2)$, and contain ultraviolet divergences which
are cancelled by the $O(e^2)$ counter terms. From (11.11) and (11.12) we have

$$\Sigma^{[2]} = \bar\Sigma^{[2]} - Z_2^{[2]}(m_0 - m) + (\slashed{p} - m)(Z_2^{[2]} - 1) \tag{11.75}$$

where $\bar\Sigma^{[2]}$ is finite, and from (11.70) we have

$$\Gamma^{[2]}_\mu(p,p') = \bar\Gamma^{[2]}_\mu(p,p') - (Z_1^{[2]} - 1)\gamma_\mu \tag{11.76}$$

where $\bar\Gamma^{[2]}_\mu$ is finite. Inserting (11.75) and (11.76) into (11.74) and equating the infinite parts
gives

$$Z_1^{[2]} = Z_2^{[2]}. \tag{11.77}$$
This relation is true to all orders ($Z_1 = Z_2$), provided a gauge-invariant regularization is
used. It is a very significant relation, as already indicated after (11.8). It shows, first, that 
the gauge principle survives renormalization provided the regularization is gauge invariant. 
More physically, it tells us that the bare and renormalized charges are related simply by (cf 
(11.6)) 


$$e_0 = e\,Z_3^{-1/2}. \tag{11.78}$$


In other words, the interaction-dependent rescaling of the bare charge is due solely to 
vacuum polarization effects in the photon propagator, which are the same for all charged 
particles interacting with the photon. By contrast, both $Z_1$ and $Z_2$ do depend on the specific
type of the interacting charged particle, since these quantities involve the particle masses. 
The ratio of bare to renormalized charge is independent of particle type. Hence if a set 
of bare charges are all equal (or ‘universal’), the renormalized ones will be too. But we 
saw in section 2.6 how just such a notion of universality was present in theories constructed 
according to the (electromagnetic) gauge principle. We now see how the universality survives 
renormalization. In volume 2 we shall find that a similar universality holds, empirically, in 
the case of the weak interaction, giving a strong indication that this force too should be 
described by a renormalizable gauge theory. 


11.7 The lepton anomalous magnetic moments and tests of QED 


Returning now to $\bar\Gamma^{[2]}_\mu$, just as in section 11.5.2 we regarded the vacuum polarization correction
$1 + \bar\Pi^{[2]}_\gamma$ as a contribution to the fermion’s charge form factor $F_1(q^2)$, so we may
expect that the vertex correction will also contribute to the form factor. Indeed, let us recall
the general form of the electromagnetic vertex for a spin-$\tfrac12$ particle (cf (8.208)):

$$-ie\,\bar u(p', s')\left[F_1(q^2)\gamma_\mu + i\kappa\,\frac{F_2(q^2)}{2m}\,\sigma_{\mu\nu}q^\nu\right]u(p, s) \tag{11.79}$$

where $\kappa$ is the ‘anomalous’ part of the magnetic moment, i.e. the magnetic moment is
$(e\hbar/2m)(1 + \kappa)$, the ‘1’ being the Dirac value calculated in section 3.5. In (11.79), $F_1$ and
$F_2$ are each normalized to 1 at $q^2 = 0$. Our vertex $\Gamma^{[2]}_\mu$ contributes to both the charge and
the magnetic moment form factors; let us call the contributions $F_1^{[2]}$ and $\kappa F_2^{[2]}$. Now the $Z_1$
counter term multiplies $\gamma_\mu$, and therefore clearly cancels a divergence in $F_1^{[2]}$. Is there also,
we may ask, a divergence in $\kappa F_2^{[2]}$?

Actually, $\kappa F_2^{[2]}$ is convergent, and this is highly significant to the physics of renormalization.
Had it been divergent, we would either have had to abandon the theory or introduce a
new counter term to cancel the divergence. This counter term would have the general form

$$\frac{K}{m}\,\bar\psi\,\sigma_{\mu\nu}\,\psi\,F^{\mu\nu}; \tag{11.80}$$


it is, indeed, an ‘anomalous magnetic moment’ interaction. But no such term exists in the 
original QED Lagrangian (11.1)! Its appearance does not seem to follow from the gauge 
principle argument, even though it is, in fact, gauge invariant. Part of the meaning of 
the renormalizability of QED (or any theory) is that all infinities can be cancelled by 
counter terms of the same form as the terms appearing in the original Lagrangian. This 
means, in other words, that all infinities can be cancelled by assuming an appropriate cut-off 
FIGURE 11.9
Contribution (which is finite) to $\gamma\gamma \to \gamma\gamma$.


dependence for the fields and parameters in the bare Lagrangian. The interaction (11.80) is 
certainly gauge invariant—but it is non-renormalizable—as we shall discuss further later. 
The message is that, in a renormalizable theory, amplitudes which do not have counterparts 
in the interactions present in the bare Lagrangian must be finite. Figure 11.9 shows another 
example of an amplitude which turns out to be finite, and there is no ‘$A^4$’ type of interaction
in QED (cf figure 10.13 (a) and the attendant comment in section 10.5). 

The calculation of the renormalized $F_1(q^2)$ and of $\kappa F_2(q^2)$ is quite laborious, not least
because three denominators are involved in the $\Gamma^{[2]}_\mu$ integral (11.67). The dedicated reader
can follow the story in section 6.3 of Peskin and Schroeder (1995). The most important
result is the value obtained for $\kappa$, the QED-induced anomalous magnetic moment of the
lepton, first calculated by Schwinger (1948a). He obtained

$$\kappa = \frac{\alpha}{2\pi} \approx 0.0011614 \tag{11.81}$$

which means a g-factor corrected from the g = 2 Dirac value to

$$g = 2 + \frac{\alpha}{\pi} \tag{11.82}$$

or, equivalently,

$$a \equiv \frac{g - 2}{2} = \frac{1}{2}\,\frac{\alpha}{\pi} \approx 0.0011614. \tag{11.83}$$


Note that since $\kappa$ is a dimensionless quantity, it cannot depend on the mass m of the internal
lepton in (11.66). As we will see, contributions from two-loop (and higher) diagrams can 
involve different leptons in internal lines, and can depend on lepton mass ratios. 
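As a quick numerical illustration of (11.81) to (11.83), the following sketch evaluates the Schwinger term and compares it with the measured electron and muon values quoted in (11.84) and (11.85). The value of $\alpha^{-1}$ is the one quoted in (11.87); the variable names are our own.

```python
import math

# Sketch: the one-loop (Schwinger) anomalous moment versus the measured values.
ALPHA_INV = 137.035999046            # alpha^-1 as quoted in (11.87)
ALPHA = 1.0 / ALPHA_INV

a_one_loop = ALPHA / (2.0 * math.pi)  # (11.83)
g_one_loop = 2.0 + ALPHA / math.pi    # (11.82)

a_e_expt = 1159652180.76e-12          # (11.84)
a_mu_expt = 116592061e-11             # (11.85)

ratio_e = a_one_loop / a_e_expt       # one-loop term relative to the electron measurement
ratio_mu = a_one_loop / a_mu_expt     # and relative to the muon measurement

print(f"a (one loop)       = {a_one_loop:.10f}")
print(f"g (one loop)       = {g_one_loop:.10f}")
print(f"one loop / a_e     = {ratio_e:.5f}")
print(f"one loop / a_mu    = {ratio_mu:.5f}")
```

The single one-loop graph already reproduces both measured values to within about half a per cent; everything discussed in the rest of this section concerns the remaining small differences.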

The prediction (11.83) may be compared with the experimental values which are, for
the electron (Workman et al. 2022)

$$a_{e,\mathrm{expt}} = [(g_e - 2)/2]_{\mathrm{expt}} = 1\,159\,652\,180.76\,(0.28) \times 10^{-12} \tag{11.84}$$

and for the muon (Workman et al. 2022)

$$a_{\mu,\mathrm{expt}} = [(g_\mu - 2)/2]_{\mathrm{expt}} = 116\,592\,061\,(41) \times 10^{-11}. \tag{11.85}$$


Of course, in Schwinger’s day the experimental accuracy was far different, but there was a 
real discrepancy (Kusch and Foley 1947) with the Dirac value (a = 0). Schwinger’s one-loop 
calculation provided a fundamental early confirmation of QED, and was the start of a long 
confrontation between theory and experiment which still continues. An extensive review is 
provided by Jegerlehner and Nyffeler (2009), dealing mainly with $a_\mu$.
FIGURE 11.10
Two two-loop graphs contributing to $a_e$. (a) A two-photon exchange graph. (b) A muon
loop inserted in the photon propagator of figure 11.8.


The extraordinarily precise values in (11.84) and (11.85) represent the results of ever
more sophisticated and imaginative experimentation. In considering the confrontation of
these numbers with the SM, we might expect $a_{e,\mathrm{expt}}$ to present the more severe challenge
to theory, since it is determined to an accuracy some 2250 times better than that of $a_{\mu,\mathrm{expt}}$.
Yet the latter is capable of probing the SM more sensitively. To explain why, we need to go
beyond the one-loop graph of figure 11.8, and consider the contributions of two-loop graphs
to $a_e$, two of which are shown in figure 11.10. The contribution from figure 11.10(a), in
which the lepton flavour does not change, is independent of the lepton mass (here $m_e$), so
it gives the same result for $a_e$ and $a_\mu$. This contribution is of order $(\alpha/\pi)$ times the lowest
order contribution (11.83). But the graph of figure 11.10(b), in which the lepton in the
internal bubble has a different flavour from the external lepton, does depend on the lepton
mass ratio, in this case $m_e/m_\mu$. The internal muon is in fact some 200 times more massive
than the external electron, and an important decoupling theorem due to Appelquist and
Carazzone (1975) tells us that such a contribution, due to a heavy internal particle, will
be suppressed by a factor of order the square of the (light/heavy) mass ratio, in this case
$(m_e/m_\mu)^2 \sim 10^{-5}$, relative to the contributions of graphs like those in figure 11.10(a). (A
similar graph with an internal $\tau$ bubble will be suppressed by $(m_e/m_\tau)^2$.) The important
general conclusion is that contributions to $a_e$ and $a_\mu$ from loop graphs with a heavy internal
particle X, of mass $m_X \gg m_e, m_\mu$, will be suppressed by a factor of $(m_e/m_X)^2$ for $a_e$, and
of $(m_\mu/m_X)^2$ for $a_\mu$. This implies that the contribution from ‘beyond the SM physics’
(represented by a mass scale $m_X$) to $a_\mu$ would be enhanced by a factor $(m_\mu/m_e)^2 \sim 43\,000$
relative to its contribution to $a_e$. This outweighs by a factor of 19 the greater experimental
accuracy of $a_{e,\mathrm{expt}}$.
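The arithmetic behind this sensitivity argument is easily reproduced. In the sketch below the lepton masses are standard values in MeV, and the factor of 2250 for the relative experimental precision is simply taken from the text; the variable names are our own.

```python
# Sketch: why a_mu probes heavy new physics more sensitively than a_e.
M_E = 0.51099895      # electron mass, MeV
M_MU = 105.6583755    # muon mass, MeV

# Decoupling suppression of a muon loop inside an electron graph, figure 11.10(b)
muon_loop_suppression = (M_E / M_MU) ** 2

# Relative enhancement of a heavy-particle contribution to a_mu versus a_e
enhancement = (M_MU / M_E) ** 2

# Precision advantage of a_e over a_mu, as quoted in the text
precision_ratio = 2250.0

net_advantage = enhancement / precision_ratio

print(f"(m_e/m_mu)^2 = {muon_loop_suppression:.2e}")
print(f"(m_mu/m_e)^2 = {enhancement:.0f}")
print(f"net sensitivity advantage of a_mu = {net_advantage:.1f}")
```

The enhancement comes out close to 43 000, and dividing by the quoted precision ratio reproduces the net factor of about 19 in favour of the muon.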

Turning now to the SM calculation of $a_e$, we may distinguish three contributions:

$$a_{e,\mathrm{theory}} = a_{e,\mathrm{theory}}(\mathrm{QED}) + a_{e,\mathrm{theory}}(\mathrm{weak}) + a_{e,\mathrm{theory}}(\mathrm{hadronic}). \tag{11.86}$$

Hitherto we have considered only the first contribution. Representative diagrams contributing
to the second and third contributions are shown in figures 11.11(a) and 11.11(b). As
regards figure 11.11(a), we may make a rough estimate of its contribution as follows. In
the electroweak theory of GSW, the coupling strength of the leptons to $Z^0$ is of the same
order of magnitude as the charge e, so we expect a factor $(\alpha/\pi)$ as in (11.83). But the
graph has a heavy internal particle $Z^0$, leading to a suppression factor of $(m_e/m_Z)^2$, and
a resulting amplitude of order $(\alpha/\pi)(m_e/m_Z)^2 \sim 0.03 \times 10^{-12}$. Calculations confirm this
estimate, which is well below the experimental error in $a_{e,\mathrm{expt}}$. The hadronic polarization
graph of figure 11.11(b) (and similar higher order ones) is harder to calculate, and the value
$1.693 \times 10^{-12}$ is given by Aoyama et al. (2019). This is just on the edge of significance, and
is not the limiting factor in comparing theory with experiment at present.
FIGURE 11.11
‘Beyond QED’ contributions to $a_{\ell,\mathrm{theory}}$ ($\ell = e, \mu$) due to (a) weak and (b) strong interaction
corrections.


The conclusion is therefore that $a_{e,\mathrm{theory}}$ is essentially given by the pure QED contribution.
This has now been calculated up to tenth order in e (i.e. $(\alpha/\pi)^5$) (Aoyama et al.
2019)². To compare with experiment, a very accurate value of the fine structure constant
is required. Different ways of measuring $\alpha$ give slightly different results. A measurement
using the recoil frequency of cesium-133 atoms in a matter-wave interferometer (Parker et
al. 2018) gave the result

$$\alpha^{-1}(\mathrm{Cs}) = 137.035999046(27). \tag{11.87}$$

With this value of $\alpha$, the prediction for $a_{e,\mathrm{theory}}$ is (Aoyama et al. 2019)

$$a_{e,\mathrm{theory}} = 1\,159\,652\,181.606\,(229)(11)(12) \times 10^{-12} \tag{11.88}$$


where the first, second, and third uncertainties are due to the fine structure constant, 
numerical evaluation of the tenth order QED terms, and the hadronic contribution. The 
theory is therefore in excellent agreement with experiment to an extraordinary level of 
accuracy. The QED part of the SM is indeed the paradigm quantum field theory. 

Moving on to $a_{\mu,\mathrm{theory}}$, we shall summarize the present situation, as reviewed by Höcker
and Marciano (2022), where the reader will find full references to the original work. The
pure QED part $a_{\mu,\mathrm{theory}}(\mathrm{QED})$ has been evaluated to five loop order. Using the value of
$\alpha^{-1}$ from (11.87) leads to

$$a_{\mu,\mathrm{theory}}(\mathrm{QED}) = 116\,584\,718.93\,(0.10) \times 10^{-11} \tag{11.89}$$

where the small error results mainly from uncertainties in the estimate of the six loop
contribution, and in the value of $\alpha$. One loop contributions to $a_{\mu,\mathrm{theory}}(\mathrm{weak})$ are suppressed
by at least a factor of $(\alpha/\pi)(m_\mu/m_W)^2 \sim 4 \times 10^{-9}$, relative to the leading term (11.83).
Two loop contributions have been evaluated, and three loop contributions are negligible.
The quoted result for the weak SM contribution is

$$a_{\mu,\mathrm{theory}}(\mathrm{weak}) = 153.6\,(1.0) \times 10^{-11}. \tag{11.90}$$


² Of course, as emphasized in this reference, behind this latest number lie the efforts of many scientists
over a period of some seventy years.

The limiting factor in comparing theory with experiment is the uncertainty in calculating
$a_{\mu,\mathrm{theory}}(\mathrm{hadronic})$. The lowest order (LO) contributions to this, shown in figure 11.11(b),
will have four powers of e, and so will be of order $\alpha^2$. A representative value for these
contributions is (Aoyama et al. 2020)

$$a_{\mu,\mathrm{theory}}(\mathrm{hadronic,\ LO}) = 6931\,(40) \times 10^{-11}. \tag{11.91}$$

Higher order hadronic terms are quoted (Höcker and Marciano 2022) as

$$a_{\mu,\mathrm{theory}}(\mathrm{hadronic,\ higher\ order}) = 6\,(18) \times 10^{-11}. \tag{11.92}$$

Adding together (11.89), (11.90), (11.91), and (11.92) gives the SM prediction

$$a_{\mu,\mathrm{theory}} = 116\,591\,810\,(1)(40)(18) \times 10^{-11} \tag{11.93}$$


where the errors are due to the weak, lowest order hadronic, and higher order hadronic 
contributions respectively. It is worth stressing that all of the SM (electromagnetic, weak 
and strong theories) is needed for the result (11.93); it is also interesting that the theoretical 
error is essentially the same as the experimental one, at this stage. 

The difference between experiment and theory is

$$a_{\mu,\mathrm{expt}} - a_{\mu,\mathrm{theory}} = 252\,(41)(43) \times 10^{-11} \tag{11.94}$$

where the errors (closely similar) are from experiment (41) and theory (43) (with all theory
errors combined in quadrature). Equation (11.94) represents an interesting but not conclusive
discrepancy of $4.2\sigma$. This discrepancy has persisted for some years now. The experimental
value given in (11.85) is that of the 2021 FNAL measurement (Abi et al. 2021), which
increased the previous discrepancy of $3.7\sigma$. More recently, the same group has reported
(Aguillard et al. 2023) further improvements in the precision of their result, which is now
given as

$$a_{\mu,\mathrm{expt}} = 116\,592\,055\,(24) \times 10^{-11}, \tag{11.95}$$

which brings the discrepancy with (11.93) to $5\sigma$.

However, much depends on the theoretical calculation of the hadronic LO contribution. 
The value (11.91) is derived (Davier et al. 2020; Keshavarzi, Nomura and Teubner 2020; 
Colangelo, Hoferichter and Stoffer 2019; Hoferichter, Hoid and Kubis 2019) from dispersion 
integrals combined with measurements of the cross-section for electron-positron annihilation 
into hadrons (the quantity R of section 9.5). An entirely independent approach, using ab 
initio simulations of lattice gauge theory (see chapter 16) has reported the value (Borsanyi 
et al. 2021) 

$$a_{\mu,\mathrm{theory}}(\mathrm{hadronic,\ LO}) = 7075\,(56) \times 10^{-11}. \tag{11.96}$$

This larger value substantially reduces the discrepancy with $a_{\mu,\mathrm{expt}}$. The final conclusion
appears to require the resolution of the theoretical discrepancy between (11.91) and (11.96).
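The bookkeeping of the last few paragraphs can be reproduced in a few lines. The sketch below sums the contributions (11.89) to (11.92), combines the errors in quadrature, and evaluates the significance of the difference from the two experimental values, using either the dispersive value (11.91) or the lattice value (11.96) for the leading hadronic term. It is only an illustration: treating all errors as independent and Gaussian is our simplifying assumption, and the function names are our own, so the results agree with the quoted significances only to within rounding.

```python
import math

# All numbers in units of 1e-11, taken from (11.85), (11.89)-(11.92), (11.95), (11.96).
QED = (116584718.93, 0.10)
WEAK = (153.6, 1.0)
HAD_LO_DISPERSIVE = (6931.0, 40.0)
HAD_LO_LATTICE = (7075.0, 56.0)
HAD_HIGHER = (6.0, 18.0)

EXPT_2021 = (116592061.0, 41.0)
EXPT_2023 = (116592055.0, 24.0)


def combine(*terms):
    """Sum central values; add uncertainties in quadrature (assumed independent)."""
    value = sum(v for v, _ in terms)
    error = math.sqrt(sum(e * e for _, e in terms))
    return value, error


def significance(expt, theory):
    """(experiment - theory) in units of the combined standard deviation."""
    diff = expt[0] - theory[0]
    return diff / math.hypot(expt[1], theory[1])


theory_dispersive = combine(QED, WEAK, HAD_LO_DISPERSIVE, HAD_HIGHER)
theory_lattice = combine(QED, WEAK, HAD_LO_LATTICE, HAD_HIGHER)

sigma_2021 = significance(EXPT_2021, theory_dispersive)
sigma_2023 = significance(EXPT_2023, theory_dispersive)
sigma_2023_lattice = significance(EXPT_2023, theory_lattice)

print(f"theory (dispersive) = {theory_dispersive[0]:.0f} +- {theory_dispersive[1]:.0f}")
print(f"2021 discrepancy    = {sigma_2021:.1f} sigma")
print(f"2023 discrepancy    = {sigma_2023:.1f} sigma")
print(f"2023, lattice HVP   = {sigma_2023_lattice:.1f} sigma")
```

With the dispersive input the two measurements sit roughly four and five standard deviations above the prediction; with the lattice input the same comparison falls below two standard deviations, which is why the resolution of the difference between (11.91) and (11.96) matters so much.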

No doubt this epic confrontation between theory and experiment will continue to be 
pursued. It is a classic example of the way in which a very high-precision measurement in a 
thoroughly ‘low-energy’ area of physics (a magnetic moment) can have profound impact on 
the ‘high-energy’ frontier—a circumstance upon which we may be increasingly dependent. 

One conclusion we can certainly draw is that renormalizable quantum field theories are 
the most predictive theories we have. We end this volume with some general reflections on 
renormalizable and non-renormalizable theories. 


11.8 Which theories are renormalizable—and does it matter? 


In the course of our travels thus far, we have met theories which exhibit three different types
of ultraviolet behaviour. In the ABC theory at one-loop order, we found that both the field
strength renormalizations and the vertex correction were finite; only the mass shifts diverged
as $\Lambda \to \infty$. The theory was called ‘super-renormalizable’. In QED, we needed divergent
renormalization constants $Z_i$ as well as an infinite mass shift, but (although we did not
attempt to explain why) these counter terms were enough to cure divergences systematically
to all orders and the theory was renormalizable. Finally, we asserted that the anomalous
coupling (11.80) was non-renormalizable. In the final section of this volume, we shall try to
shed more light on these distinctions and their significance.

Is there some way of telling which of these ultraviolet behaviours a given Lagrangian is
going to exhibit, without going through the calculations? The answer is yes (nearly), and
the test is surprisingly simple. It has to do with the dimensionality of a theory’s coupling
constant. We have seen (section 6.3.1) that the dimensionality of ‘g’ in the ABC theory is
$M^1$ (using mass as the remaining dimension when $\hbar = c = 1$), that of e in QED is $M^0$
(section 7.4) and that of the coefficient of the anomalous coupling $\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}$ in (11.80)
is $M^{-1}$. These couplings have positive, zero, and negative mass dimension, respectively. It
is no accident that the three theories, with different dimensions for their couplings, have
different ultraviolet behaviour and hence different renormalizability.

That coupling constant dimensionality and ultraviolet behaviour are related can be 
understood by simple dimensional considerations. Compare, for example, the vertex correc- 
tions in the ABC theory (figure 10.6) and in QED (figure 11.8). These amplitudes behave 
essentially as 


$$g_{\mathrm{ph}}^2\int\frac{d^4k}{k^2\,k^2\,k^2} \tag{11.97}$$

and

$$e^2\int\frac{d^4k}{\slashed{k}\,\slashed{k}\,k^2} \tag{11.98}$$

respectively, for large k. Both are dimensionless, but in (11.97) the positive (mass)$^2$ dimension
of $g_{\mathrm{ph}}^2$ is compensated by one additional factor of $k^2$ in the denominator of the integral,
as compared with (11.98), with the result that (11.97) is ultraviolet convergent but (11.98)
is not. The analysis can be extended to higher-order diagrams: for the ABC theory, the more
powers of $g_{\mathrm{ph}}$ which are involved, the more denominator factors are necessary, and hence
the better the convergence is. Indeed, in this kind of ‘super-renormalizable’ theory, only a
finite number of diagrams are ultraviolet divergent, to all orders in perturbation theory.

It is clear that some kind of opposite situation must obtain when the coupling constant 
dimensionality is negative; for then, as the order of the perturbation theory increases, the 
negative powers of M in the coupling constant factors must be compensated by positive 
powers of k in the numerators of loop integrals. Hence the divergence will tend to get worse
at each successive order. A famous example of such a theory is Fermi’s original theory of
$\beta$-decay (Fermi 1934a, b), referred to in section 1.3.5, in which the interaction density has
the ‘four-fermion’ form

$$G_{\mathrm F}\,\bar\psi_p(x)\,\psi_n(x)\,\bar\psi_e(x)\,\psi_{\nu_e}(x) \tag{11.99}$$

where $G_{\mathrm F}$ is the ‘Fermi constant’. To find the dimensionality of $G_{\mathrm F}$, we first establish that
of the fermion field by considering a mass term $m\bar\psi\psi$, for example. The integral of this over
$d^3x$ gives one term in the Hamiltonian, which has dimension M. We deduce that $[\psi] = \tfrac32$,
since $[d^3x] = -3$. Hence $[\bar\psi\psi\bar\psi\psi] = 6$, and so $[G_{\mathrm F}] = -2$. The coupling constant $G_{\mathrm F}$ in
(11.99) therefore has a negative mass dimension, just like the coefficient K/m in (11.80).
Indeed, the four-fermion theory is also non-renormalizable.
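The dimension counting used above is mechanical enough to automate. The following sketch is our own illustration, with hypothetical function names: it assigns mass dimension 1 to scalar and vector fields and to each derivative, and $\tfrac32$ to each fermion field, and requires the interaction density to have total dimension 4, which is equivalent to the Hamiltonian argument given in the text.

```python
from fractions import Fraction

# Mass dimensions of the building blocks in four space-time dimensions (hbar = c = 1).
FIELD_DIM = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "vector": Fraction(1),
    "derivative": Fraction(1),
}


def coupling_dimension(**content):
    """Mass dimension of the coupling multiplying an operator with the given content."""
    operator_dim = sum(FIELD_DIM[kind] * count for kind, count in content.items())
    return Fraction(4) - operator_dim


def classify(dim):
    """Naive renormalizability classification from the coupling's mass dimension."""
    if dim > 0:
        return "super-renormalizable"
    if dim == 0:
        return "renormalizable"
    return "non-renormalizable"


examples = {
    "ABC theory, g":            coupling_dimension(scalar=3),
    "QED, e":                   coupling_dimension(fermion=2, vector=1),
    "anomalous moment (11.80)": coupling_dimension(fermion=2, vector=1, derivative=1),
    "four-fermion, G_F":        coupling_dimension(fermion=4),
}

for name, dim in examples.items():
    print(f"{name:26s} [coupling] = {dim}  ->  {classify(dim)}")
```

The output reproduces the sequence $+1, 0, -1, -2$ found in the text; the field-strength tensor $F^{\mu\nu}$ is counted as one vector field plus one derivative.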

Must such a theory be rejected? Let us briefly sketch the consequences of an interaction 
of the form (11.99), but slightly simpler, namely 


$$G_{\mathrm F}\,\bar\psi_n(x)\,\psi_n(x)\,\bar\psi_{\nu_e}(x)\,\psi_{\nu_e}(x) \tag{11.100}$$

FIGURE 11.12
Lowest order contribution to $\nu_e + n \to \nu_e + n$ in the model defined by the interaction (11.100).


FIGURE 11.13
Second-order (one-loop) contribution to $\nu_e + n \to \nu_e + n$.


where, for the present purposes, the neutron is regarded as point-like. Consider, for example,
the scattering process $\nu_e + n \to \nu_e + n$. To lowest order in $G_{\mathrm F}$, this is given by the tree
diagram (or ‘contact term’) of figure 11.12, which contributes a constant $-iG_{\mathrm F}$ to the
invariant amplitude for the process, disregarding the spinor factors for the moment. A one-loop
$O(G_{\mathrm F}^2)$ correction is shown in figure 11.13. Inspection of figure 11.13 shows that this
is an s-channel process (recall section 6.3.3): let us call the amplitude $-iG_{\mathrm F}G^{[2]}(s)$, where
one $G_{\mathrm F}$ factor has been extracted, so that the correction can be compared with the tree
amplitude and $G^{[2]}(s)$ is dimensionless. Then $G^{[2]}(s)$ is given by

$$G^{[2]}(s) = -iG_{\mathrm F}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{\slashed{k} - m_{\nu_e}}\,\frac{i}{(\slashed{p}_{\nu_e} + \slashed{p}_n - \slashed{k}) - m_n}. \tag{11.101}$$

As expected, the negative mass dimension of $G_{\mathrm F}$ leaves fewer k-factors in the denominator of
the loop integral. Indeed, manipulations exactly like those we used in the case of $\Sigma^{[2]}$ show
that $G^{[2]}(s)$ has a quadratic divergence, and that $dG^{[2]}/ds$ has a logarithmic divergence. The
extra denominators associated with second and higher derivatives of $G^{[2]}(s)$ are sufficient
to make these integrals finite.
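The power counting used here and in the comparison of (11.97) with (11.98) can be summarized in a small helper. This is a sketch under stated assumptions: each loop contributes four powers of momentum, each fermion propagator removes one, each boson propagator removes two, and (on dimensional grounds) each derivative with respect to s removes two more. The function names are our own.

```python
def superficial_degree(loops, fermion_props=0, boson_props=0, s_derivatives=0):
    """Naive ultraviolet power counting for a loop integral in four dimensions."""
    return 4 * loops - fermion_props - 2 * boson_props - 2 * s_derivatives


def describe(degree):
    if degree < 0:
        return "convergent"
    if degree == 0:
        return "logarithmically divergent"
    return f"divergent like Lambda^{degree}"


cases = {
    "ABC vertex (11.97)":        superficial_degree(1, boson_props=3),
    "QED vertex (11.98)":        superficial_degree(1, fermion_props=2, boson_props=1),
    "four-fermion loop (11.101)": superficial_degree(1, fermion_props=2),
    "its first s-derivative":    superficial_degree(1, fermion_props=2, s_derivatives=1),
    "its second s-derivative":   superficial_degree(1, fermion_props=2, s_derivatives=2),
    "two loops, four fermion propagators": superficial_degree(2, fermion_props=4),
}

for name, degree in cases.items():
    print(f"{name:38s} D = {degree:+d}: {describe(degree)}")
```

The counting reproduces the statements in the text: the ABC vertex is finite, the QED vertex is logarithmically divergent, the four-fermion loop is quadratically divergent with a logarithmically divergent first derivative, and the situation deteriorates at two loops.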

The standard procedure would now be to cancel these divergences with counter terms. 


There will certainly be one counter term arising naturally from writing the bare version of
(11.100) as (cf (11.5)):

$$G_{0\mathrm F}\,\bar\psi_{0n}\psi_{0n}\bar\psi_{0\nu_e}\psi_{0\nu_e} = G_{\mathrm F}\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} + (Z_4 - 1)\,G_{\mathrm F}\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} \tag{11.102}$$

where $Z_4 G_{\mathrm F} = G_{0\mathrm F}Z_{2n}Z_{2\nu_e}$, and the Z’s are the field strength renormalization constants
for the n and $\nu_e$ fields. Including the tree graph of figure 11.12, the amplitude of figure 11.13,
and the counter term, the total amplitude to $O(G_{\mathrm F}^2)$ is given by

$$iM = -iG_{\mathrm F} - iG_{\mathrm F}G^{[2]}(s) - iG_{\mathrm F}(Z_4 - 1). \tag{11.103}$$


As in our earlier examples, $Z_4$ will be determined from a renormalization condition. In this
case, we might demand, for example, that the amplitude M reduces to its tree-level value at the
threshold value $s = s_0$, where $s_0 = (m_n + m_{\nu_e})^2$. Then to $O(G_{\mathrm F}^2)$ we find

$$Z_4^{[2]} = 1 - G^{[2]}(s_0) \tag{11.104}$$

and our amplitude (11.103) is, in fact,

$$-iG_{\mathrm F} - iG_{\mathrm F}\left[G^{[2]}(s) - G^{[2]}(s_0)\right]. \tag{11.105}$$


In (11.105), we see the familiar outcome of such renormalization: the appearance of
subtractions of the divergent amplitude (cf (10.74), (11.11), (11.33), and (11.70)). In fact,
because $dG^{[2]}/ds$ is also divergent, we need a second subtraction and, correspondingly, a
new counter term, not present in the original Lagrangian, of the form

$$G_d\,\bar\psi_n\slashed{\partial}\psi_n\,\bar\psi_{\nu_e}\slashed{\partial}\psi_{\nu_e}$$

for example; there will also be others, but we are concerned only with the general idea.
The occurrence of such a new counter term is characteristic of a non-renormalizable theory,
but at this stage of the proceedings the only penalty we pay is the need to import another
constant from experiment, namely the value D of $dG^{[2]}/ds$ at some fixed s, say $s = s_0$;
D will be related to the renormalized value of $G_d$. We will then write our renormalized
amplitude, up to $O(G_{\mathrm F}^2)$, as

$$-iG_{\mathrm F}\left[1 + D(s - s_0) + \bar G^{[2]}(s)\right] \tag{11.106}$$

where $\bar G^{[2]}(s)$ is finite, and vanishes along with its first derivative at $s = s_0$; that is, $\bar G^{[2]}(s)$
contributes calculable terms of order $(s - s_0)^2$ if expanded about $s = s_0$.

The moral of the story so far, then, is that we can perform a one-loop renormalization of
this theory, at the cost of taking additional parameters from experiments and introducing
new terms in the Lagrangian. What about the next order? Figure 11.14 shows a two-loop
diagram in our theory, which is of order $G_{\mathrm F}^3$. Writing the amplitude as $-iG_{\mathrm F}G^{[3]}(s)$, the
ultraviolet behaviour of $G^{[3]}(s)$ is given by

$$(G_{\mathrm F})^2\int\frac{d^4k_1\,d^4k_2}{\slashed{k}_1\,\slashed{k}_2\,\slashed{k}\,\slashed{k}} \tag{11.107}$$

where k is a linear function of $k_1$ and $k_2$. This has a leading ultraviolet divergence $\sim \Lambda^4$,
even worse than that of $G^{[2]}$. As suggested earlier, it is indeed the case that, the higher
we go in perturbation theory in this model, the worse the divergences become. We can, of
course, eliminate this divergence in $G^{[3]}$ by performing a further subtraction, requiring the
provision of more parameters from experiment. By now the pattern should be becoming
clear: new counter terms will have to be introduced at each order of perturbation theory,
and ultimately we shall need an infinite number of them, and hence an infinite number of
parameters determined from experiment, and we shall have zero predictive capacity.
FIGURE 11.14
A two-loop contribution to $\nu_e + n \to \nu_e + n$ in the model defined by (11.100).


Does this imply that the theory is useless? We have learned that $\bar G^{[2]}(s)$ produces a
calculable term of order $G_{\mathrm F}^2(s - s_0)^2$ when expanded about $s = s_0$; and that $\bar G^{[3]}$ will
produce a calculable term of order $G_{\mathrm F}^3(s - s_0)^3$, and so on. Now, from the discussion after
(11.99), $G_{\mathrm F}$ itself is a dimensionless number divided by the square of some mass. As we
saw in section 1.3.5 (and will return to in more detail in volume 2), in the case of the
physical weak interaction this mass in $G_{\mathrm F}$ is the W-mass, and $G_{\mathrm F} \sim \alpha/M_W^2$. Hence our loop
corrections have the form $\alpha^2(s - s_0)^2/M_W^4$, $\alpha^3(s - s_0)^3/M_W^6$, .... We now see that for low
enough energy close to threshold, where $(s - s_0) \ll M_W^2$, it will be a good approximation to
stop at the one-loop level. As we go up in energy, we will need to include higher-order loops,
and correspondingly more parameters will have to be drawn from experiment. But only when
we begin to approach an energy $\sqrt{s} \sim M_W/\sqrt{\alpha} \sim G_{\mathrm F}^{-1/2} \sim 300$ GeV will this theory be
terminally sick. This was pointed out by Heisenberg (1939). For this argument to work, it
is important that the ultraviolet divergences at a given order in perturbation theory (i.e. a
given number of loops) should have been removed by renormalization, otherwise factors of
$\Lambda^2$ will enter, in place of the $(s - s_0)$ factors, for example.
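To put numbers to this estimate, the sketch below evaluates the scale $G_{\mathrm F}^{-1/2}$ and the size of the dimensionless combination $G_{\mathrm F}\,s$ that controls successive loop corrections. The values of $G_{\mathrm F}$, $M_W$ and $\alpha$ are standard ones that we supply; neglecting $s_0$ relative to s, and the function name, are our own simplifications.

```python
import math

G_F = 1.1663787e-5      # Fermi constant in GeV^-2 (standard value)
M_W = 80.38             # W mass in GeV (approximate)
ALPHA = 1.0 / 137.036   # fine structure constant (approximate)

fermi_scale = G_F ** -0.5                     # G_F^(-1/2) in GeV
mw_over_sqrt_alpha = M_W / math.sqrt(ALPHA)   # M_W / sqrt(alpha) in GeV


def expansion_parameter(sqrt_s):
    """G_F * s for a CM energy sqrt_s in GeV, with s_0 neglected."""
    return G_F * sqrt_s ** 2


print(f"G_F^(-1/2)        = {fermi_scale:.0f} GeV")
print(f"M_W / sqrt(alpha) = {mw_over_sqrt_alpha:.0f} GeV")
for energy in (1.0, 10.0, 100.0, 300.0):
    print(f"sqrt(s) = {energy:5.0f} GeV: G_F s = {expansion_parameter(energy):.2e}")
```

The two estimates of the breakdown scale agree only at the order-of-magnitude level, as the ‘$\sim$’ signs in the text indicate. The expansion parameter is tiny at the energies of nuclear $\beta$-decay and reaches unity near 300 GeV.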

We have seen that a non-renormalizable theory can be useful at energies well below the 
‘natural’ scale specified by its coupling constant. Let us look at this in a slightly different 
way, by considering the two four-fermion interaction terms introduced at one loop, 


$$G_{\mathrm F}\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} \quad\text{and}\quad G_d\,\bar\psi_n\slashed{\partial}\psi_n\,\bar\psi_{\nu_e}\slashed{\partial}\psi_{\nu_e}. \tag{11.108}$$

We know that $G_{\mathrm F} \sim M_W^{-2}$, and similarly $G_d \sim M_W^{-4}$ (from dimensional counting, or from
the association of the $G_d$ term with the $O(G_{\mathrm F}^2)$ counter term). From dimensional analysis, or
by referring to (11.106) and remembering that D is of order $G_{\mathrm F}$ for consistency, we see that
the second term in (11.108), when evaluated at tree level, is of order $(s - s_0)/M_W^2$ times the
first. It follows that higher derivative interactions, and in general terms with successively
larger negative mass dimension, are increasingly suppressed at low energies.

Where, then, do renormalizable theories fit into this? Those with couplings having pos- 
itive mass dimension (‘super-renormalizable’) have, as we have seen, a limited number of 
infinities and can be quickly renormalized. The ‘merely renormalizable’ theories have dimensionless
coupling constants, such as e (or $\alpha$). In this case, since there are no mass factors (for
good or ill) to be associated with powers of $\alpha$, as we go up in order of perturbation theory it
would seem plausible that the divergences get essentially no worse, and can be cured by the 
counter terms which compensated those simplest divergences which we examined in earlier 
sections—though for QED the proof is difficult, and took many years to perfect. 

Given any renormalizable theory, such as QED, it is always possible to suppose that 
the ‘true’ theory contains additional non-renormalizable terms, provided their mass scale is 
very much larger than the energy scale at which the theory has been tested. For example, a 
term of the form (11.80) with ‘K/m’ replaced by some very large inverse mass $M^{-1}$ would
be possible, and would contribute an amount of order 4e/M to a lepton magnetic moment.
The present level of agreement between theory and experiment in the case of the electron’s 
moment implies that M > 4 x 10° GeV. 

From this perspective, then, it may be less of a mystery why renormalizable theories are 
generally the relevant ones at presently probed energies. Returning to the line of thought
introduced in section 10.1.1, we may imagine that a ‘true’ theory exists at some high en- 
ergy scale A, which can be written in terms of all possible fields and their couplings, as 
allowed by certain symmetry principles. Our particular renormalizable subset of these the- 
ories then emerges as a low-energy effective theory, due to the strong suppression of the 
non-renormalizable terms. Of course, for this point of view to hold, we must assume that 
the latter interactions do not have ‘unnaturally large’ couplings, when expressed in terms 
of A. 

This interpretation, if correct, deals rather neatly with what was, for many physicists, 
an awkward aspect of renormalizable theories. On the one hand, it was certainly an achieve- 
ment to have rendered all perturbative calculations finite as the cut-off went to infinity; but 
on the other, it was surely unreasonable to expect any such theory, established by confronta- 
tion with experiments in currently accessible energy regimes, really to describe physics at 
arbitrarily high energies. On the ‘low-energy effective field theory’ interpretation, we can en- 
joy the calculational advantages of renormalizable field theories, while acknowledging—with 
no contradiction—the likelihood that at some scale ‘new physics’ will enter. 

Just this point of view is now widely accepted as regards the SM itself. In the final 
section of the second volume of this book, we shall introduce the Standard Model Effective 
Field Theory (SMEFT), in which the Lagrangian of the Standard Model is supplemented by
the addition of a set of operators of dimension 6, representing the sub-TeV scale effects of 
interactions occurring at a higher energy. This offers a general framework for parametrizing 
possible deviations from the SM predictions, which may be revealed by precision experiments 
at the LHC, and thus give clues to new physics which lies beyond the Standard Model. 

Having thus argued that renormalizable theories emerge ‘naturally’ as low-energy theo- 
ries, we now seem to be faced with another puzzle: why were weak interactions successfully 
describable, for many years, in terms of the non-renormalizable four-fermion theory? The 
answer is that non-renormalizable theories may be physically detectable at low energies if 
they contribute to processes that would otherwise be forbidden. For example, the fact that 
(as far as we know) neutrinos have neither electromagnetic nor strong interactions, but 
only weak interactions, allowed the four-fermion theory to be detected—but amplitudes 
were suppressed by powers of $s/M_W^2$ (relative to comparable electromagnetic ones) and this
was, indeed, why it was called ‘weak’! 

In the case of the weak interaction, the reader may perhaps wonder why, if it was
understood that the four-fermion theory could after all be handled up to energies of order
10 GeV, so much effort went into creating a renormalizable theory of weak interactions, as
it undoubtedly did. Part of the answer is that the utility of non-renormalizable interactions 
was a rather late realization (see, for example, Weinberg 1979). But surely the prospect 
of having a theory with the predictive power of QED was a determining factor. At all 
events, the preceding argument for the ‘naturalness’ of renormalizable theories as low-energy 
effective theories provides strong expectation that such a description of weak interactions 
should exist. 

We shall discuss the construction of the currently accepted renormalizable theory of 
electroweak interactions in volume 2. We can already anticipate that the first step will be 
to replace the ‘negative-mass-dimensioned’ constant $G_F$ by a dimensionless one. The most
obvious way to do this is to envisage a Yukawa-type theory of weak interactions mediated by 
a massive quantum (as, of course, Yukawa himself did—see section 1.3.5). The four-fermion 
FIGURE 11.15
One-Z (Yukawa-type) exchange process in $\nu_e + n \to \nu_e + n$.


process of figure 11.12 would then be replaced by that of figure 11.15, with amplitude 
(omitting spinors) $\sim g_Z^2/(q^2 - m_Z^2)$ where $g_Z$ is dimensionless. For small $q^2 \ll m_Z^2$, this
reduces to the contact four-fermion form of figure 11.12, with an effective $G_F \sim g_Z^2/m_Z^2$,
showing the origin of the negative mass dimensions of $G_F$. It is clear that even if the new
theory were to be renormalizable, many low-energy processes would be well described by 
an effective non-renormalizable four-fermion theory, as was indeed the case historically. 
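
The reduction of the exchange amplitude to a contact form can be checked numerically. The following is a minimal sketch, not taken from the text: the function names and the numerical values of the coupling and mass are illustrative assumptions, and spinor factors are omitted as in the discussion above.

```python
def exchange_amplitude(q2, g, m):
    """Amplitude g^2 / (q^2 - m^2) for exchange of a quantum of mass m (spinors omitted)."""
    return g**2 / (q2 - m**2)


def contact_amplitude(g, m):
    """The q^2 -> 0 limit of the exchange amplitude: a constant of magnitude g^2 / m^2."""
    return -g**2 / m**2


def relative_deviation(q2, g, m):
    """Fractional difference between the exchange and contact amplitudes at a given q^2."""
    return abs(exchange_amplitude(q2, g, m) / contact_amplitude(g, m) - 1.0)


if __name__ == "__main__":
    g, m = 0.65, 91.0  # hypothetical coupling and mass (GeV), chosen for illustration only
    for q2 in (-1.0, -100.0, -1000.0, -8000.0):  # spacelike q^2 in GeV^2
        print(f"q^2 = {q2:8.1f} GeV^2   deviation = {relative_deviation(q2, g, m):.4f}")
```

The deviation grows like $|q^2|/m^2$, so the contact description is adequate only while $|q^2|$ stays well below the squared mass of the exchanged quantum.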

Unfortunately, we shall see in volume 2 that the application of this simple idea to the 
charge-changing weak interactions does not, after all, lead to a renormalizable theory. This 
teaches us an important lesson: a dimensionless coupling does not necessarily guarantee
renormalizability. 

To arrive at a renormalizable theory of the weak interactions it seems to be necessary 
to describe them in terms of a gauge theory (recall the ‘universality’ hints mentioned in 
section 11.6). Yet the mediating gauge field quanta have mass, which appears to contradict 
gauge invariance. The remarkable story of how gauge field quanta can acquire mass while 
preserving gauge invariance is reserved for volume 2. 

A number of other non-renormalizable interactions are worth mentioning. Perhaps the 
most famous of all is gravity, characterized by Newton’s constant $G_N$, which has the value
$(1.2 \times 10^{19}\ \mathrm{GeV})^{-2}$. The detection of gravity at energies so far below $10^{19}$ GeV is due,
of course, to the fact that the gravitational fields of all the particles in a macroscopic 
piece of matter add up coherently. At the level of the individual particles, its effect is still 
entirely negligible. Another example may be provided by baryon and/or lepton violating 
interactions, mediated by highly suppressed non-renormalizable terms. Such things are 
frequently found when the low-energy limit is taken of theories defined (for example) at 
energies of order 10!° GeV or higher. 

The stage is now set for the discussion, in volume 2, of the renormalizable non-Abelian 
gauge field theories which describe the weak and strong sectors of the SM. 


3 The most general renormalizable Lagrangian with the field content, and the gauge symmetries, of the
Standard Model automatically conserves baryon and lepton number (Weinberg 1996, pp 316-7). 




Problems 
11.1 Establish the values of the counter terms given in (11.12). 


11.2 Convince yourself of the rule ‘each closed fermion loop carries an additional factor 
-1’. 


11.3 Explain why the trace is taken in (11.14). 

11.4 Verify (11.15). 

11.5 Verify the quoted relation $P^\mu_\rho P^\rho_\nu = P^\mu_\nu$ where $P^\mu_\nu = g^\mu_\nu - q^\mu q_\nu/q^2$ (cf (11.26)).

11.6 Verify (11.39) for $q^2 \ll m^2$.

11.7 Verify (11.55) for $-q^2 \gg m^2$.

11.8 Check the estimate (11.60). 


11.9 Find the dimensionality of ‘$E$’ in an interaction of the form $E(F_{\mu\nu}F^{\mu\nu})^2$. Express this
interaction in terms of the $\boldsymbol{E}$ and $\boldsymbol{B}$ fields. Is such a term finite or infinite in QED? How
might it be measured?
A


Non-Relativistic Quantum Mechanics 


This appendix is intended as a very terse ‘revision’ summary of those aspects of non- 
relativistic quantum mechanics that are particularly relevant for this book. A fuller account 
may be found in Mandl (1992), for example. 

Natural units $\hbar = c = 1$ (see appendix B).

Fundamental postulate of quantum mechanics: 


$$[\hat{x}_i, \hat{p}_j] = i\delta_{ij}. \tag{A.1}$$

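Coordinate representation:
$$\hat{\boldsymbol{p}} = -i\nabla \tag{A.2}$$
$$\hat{H}\psi(\boldsymbol{x},t) = i\frac{\partial\psi(\boldsymbol{x},t)}{\partial t} \tag{A.3}$$
$$\hat{H} = \frac{\hat{\boldsymbol{p}}^2}{2m} + \hat{V} \tag{A.4}$$
and so
$$\left(-\frac{1}{2m}\nabla^2 + \hat{V}\right)\psi(\boldsymbol{x},t) = i\frac{\partial\psi(\boldsymbol{x},t)}{\partial t}. \tag{A.5}$$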
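Probability density and current (see problem 3.1 (a)):
$$\rho = \psi^*\psi = |\psi|^2 \ge 0 \tag{A.6}$$
$$\boldsymbol{j} = \frac{1}{2mi}\left[\psi^*(\nabla\psi) - (\nabla\psi^*)\psi\right] \tag{A.7}$$
with
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\boldsymbol{j} = 0. \tag{A.8}$$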
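Free-particle solutions:
$$\phi(\boldsymbol{x},t) = u(\boldsymbol{x})e^{-iEt} \tag{A.9}$$
$$\hat{H}_0 u = Eu \tag{A.10}$$
where
$$\hat{H}_0 = \hat{H}(\hat{V} = 0). \tag{A.11}$$
Box normalization:
$$\int_V u^*(\boldsymbol{x})u(\boldsymbol{x})\,d^3x = 1. \tag{A.12}$$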


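Angular momentum: Three Hermitian operators $(\hat{J}_x, \hat{J}_y, \hat{J}_z)$ satisfying
$$[\hat{J}_x, \hat{J}_y] = i\hbar\hat{J}_z,$$
and corresponding relations obtained by rotating the $x$-$y$-$z$ subscripts. $[\hat{\boldsymbol{J}}^2, \hat{J}_z] = 0$ implies
complete sets of states exist with definite values of $\hat{\boldsymbol{J}}^2$ and $\hat{J}_z$. Eigenvalues of $\hat{\boldsymbol{J}}^2$ are (with
$\hbar = 1$) $j(j+1)$ where $j = 0, \frac{1}{2}, 1, \ldots$; eigenvalues of $\hat{J}_z$ are $m$ where $-j \le m \le j$, for given $j$.
For orbital angular momentum, $\hat{\boldsymbol{J}} \to \hat{\boldsymbol{L}} = \boldsymbol{r} \times \hat{\boldsymbol{p}}$ and eigenfunctions are spherical harmonics
$Y_{lm}(\theta, \phi)$, for which eigenvalues of $\hat{\boldsymbol{L}}^2$ and $\hat{L}_z$ are $l(l+1)$ and $m$ where $-l \le m \le l$. For
spin-$\frac{1}{2}$ angular momentum, $\hat{\boldsymbol{J}} \to \frac{1}{2}\boldsymbol{\sigma}$ where the Pauli matrices $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.13}$$
Eigenvectors of $\hat{s}_z$ are $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (eigenvalue $+\frac{1}{2}$), and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (eigenvalue $-\frac{1}{2}$).

Interaction with electromagnetic field: Particle of charge $q$ in electromagnetic vector
potential $\boldsymbol{A}$
$$\hat{\boldsymbol{p}} \to \hat{\boldsymbol{p}} - q\boldsymbol{A}. \tag{A.14}$$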
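Thus
$$\frac{1}{2m}(\hat{\boldsymbol{p}} - q\boldsymbol{A})^2\psi = i\frac{\partial\psi}{\partial t} \tag{A.15}$$
and so
$$-\frac{1}{2m}\nabla^2\psi + i\frac{q}{m}\boldsymbol{A}\cdot\nabla\psi + \frac{q^2}{2m}\boldsymbol{A}^2\psi = i\frac{\partial\psi}{\partial t}. \tag{A.16}$$

Note: (a) chosen gauge $\nabla\cdot\boldsymbol{A} = 0$; (b) $q^2$ term is usually neglected.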

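Example: Magnetic field along $z$-axis, possible $\boldsymbol{A}$ consistent with $\nabla\cdot\boldsymbol{A} = 0$ is $\boldsymbol{A} =
\frac{1}{2}B(-y, x, 0)$ such that $\nabla\times\boldsymbol{A} = (0, 0, B)$. Inserting this into the second term on left-hand
side of (A.16) gives
$$\frac{iqB}{2m}\left(-y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\right)\psi = -\frac{qB}{2m}\hat{L}_z\psi \tag{A.17}$$
which generalizes to the standard orbital magnetic moment interaction $-\hat{\boldsymbol{\mu}}\cdot\boldsymbol{B}\psi$ where
$$\hat{\boldsymbol{\mu}} = \frac{q}{2m}\hat{\boldsymbol{L}}. \tag{A.18}$$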
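
The gauge condition and field of the example vector potential can be confirmed with a few lines of code. This is a minimal sketch using central finite differences; the helper names `partial`, `divergence` and `curl` are hypothetical, and the field strength is set to $B = 1$ purely for illustration.

```python
def vector_potential(x, y, z, B=1.0):
    """A = (1/2) B (-y, x, 0), the potential used in the example above."""
    return (-0.5 * B * y, 0.5 * B * x, 0.0)


def partial(f, point, axis, h=1e-5):
    """Central-difference derivative of the scalar function f along one coordinate axis."""
    up, dn = list(point), list(point)
    up[axis] += h
    dn[axis] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


def divergence(A, point):
    """Sum over i of dA_i/dx_i at the given point."""
    return sum(partial(lambda x, y, z, i=i: A(x, y, z)[i], point, i) for i in range(3))


def curl(A, point):
    """Components of curl A at the given point."""
    def d(i, j):  # dA_i/dx_j
        return partial(lambda x, y, z: A(x, y, z)[i], point, j)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


if __name__ == "__main__":
    point = (0.3, -1.2, 2.0)
    print("div A  =", divergence(vector_potential, point))
    print("curl A =", curl(vector_potential, point))
```

Because this potential is linear in the coordinates, the central differences are exact up to rounding, so the divergence vanishes and the curl points along $z$ to within floating-point error.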
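Time-dependent perturbation theory:
$$\hat{H} = \hat{H}_0 + \hat{V} \tag{A.19}$$
$$\hat{H}\psi = i\frac{\partial\psi}{\partial t}. \tag{A.20}$$
Unperturbed problem:
$$\hat{H}_0 u_n = E_n u_n. \tag{A.21}$$
Completeness:
$$\psi(\boldsymbol{x},t) = \sum_n a_n(t)u_n(\boldsymbol{x})e^{-iE_n t}. \tag{A.22}$$
First-order perturbation theory:
$$a_{fi} = -i\iint d^3x\,dt\, u_f^*(\boldsymbol{x})e^{+iE_f t}\,\hat{V}(\boldsymbol{x},t)\,u_i(\boldsymbol{x})e^{-iE_i t} \tag{A.23}$$
which has the form
$$a_{fi} = -i\int (\text{volume element})\,(\text{final state})^*\,(\text{perturbing potential})\,(\text{initial state}). \tag{A.24}$$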




Important examples:

(i) $\hat{V}$ independent of $t$:
$$a_{fi} = -iV_{fi}\,2\pi\delta(E_f - E_i) \tag{A.25}$$
where
$$V_{fi} = \int d^3x\, u_f^*(\boldsymbol{x})\hat{V}(\boldsymbol{x})u_i(\boldsymbol{x}). \tag{A.26}$$

(ii) Oscillating time-dependent potential:

(a) if $\hat{V} \sim e^{-i\omega t}$, time integral of $a_{fi}$ is
$$\int dt\, e^{+iE_f t}e^{-i\omega t}e^{-iE_i t} = 2\pi\delta(E_f - E_i - \omega) \tag{A.27}$$
i.e. the system has absorbed energy from potential;

(b) if $\hat{V} \sim e^{+i\omega t}$, time integral of $a_{fi}$ is
$$\int dt\, e^{+iE_f t}e^{+i\omega t}e^{-iE_i t} = 2\pi\delta(E_f + \omega - E_i) \tag{A.28}$$
i.e. the potential has absorbed energy from system.


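Absorption and emission of photons: For electromagnetic radiation, far from its sources,
the vector potential satisfies the wave equation
$$\nabla^2\boldsymbol{A} - \frac{\partial^2\boldsymbol{A}}{\partial t^2} = 0. \tag{A.29}$$
Solution:
$$\boldsymbol{A}(\boldsymbol{x},t) = \boldsymbol{A}_0\exp(-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}) + \boldsymbol{A}_0^*\exp(+i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}). \tag{A.30}$$
With gauge condition $\nabla\cdot\boldsymbol{A} = 0$ we have
$$\boldsymbol{k}\cdot\boldsymbol{A}_0 = 0 \tag{A.31}$$
and there are two independent polarization vectors for photons.
Treat the interaction in first-order perturbation theory:
$$\hat{V}(\boldsymbol{x},t) = (iq/m)\boldsymbol{A}(\boldsymbol{x},t)\cdot\nabla. \tag{A.32}$$
Thus
$$\begin{aligned}
\boldsymbol{A}_0\exp(-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}) &\equiv \text{absorption of photon of energy } \omega \\
\boldsymbol{A}_0^*\exp(+i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}) &\equiv \text{emission of photon of energy } \omega.
\end{aligned} \tag{A.33}$$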


B 


Natural Units 


In particle physics, a widely adopted convention is to work in a system of units, called 
natural units, in which 


$$\hbar = c = 1. \tag{B.1}$$

This avoids having to keep track of untidy factors of $\hbar$ and $c$ throughout a calculation;
only at the end is it necessary to convert back to more usual units. Let us spell out the
implications of this choice of $c$ and $\hbar$.

(i) $c = 1$. In conventional MKS units $c$ has the value
$$c = 3 \times 10^8\ \mathrm{m\,s^{-1}}. \tag{B.2}$$
By choosing units such that
$$c = 1 \tag{B.3}$$
since a velocity has the dimensions
$$[c] = [L][T]^{-1} \tag{B.4}$$


we are implying that our unit of length is numerically equal to our unit of time. In this 
sense, length and time are equivalent dimensions: 


[L] = [T]. (B.5) 
Similarly, from the energy-momentum relation of special relativity 
$$E^2 = \boldsymbol{p}^2c^2 + m^2c^4 \tag{B.6}$$


we see that the choice of c = 1 also implies that energy, mass and momentum all have 
equivalent dimensions. In fact, it is customary to refer to momenta in units of ‘MeV/c’ or 
‘GeV/c’; these all become ‘MeV’ or ‘GeV’ when c = 1. 

(ii) $\hbar = 1$. The numerical value of Planck’s constant is
$$\hbar = 6.6 \times 10^{-22}\ \mathrm{MeV\,s} \tag{B.7}$$
and $\hbar$ has dimensions of energy multiplied by time so that
$$[\hbar] = [M][L]^2[T]^{-1}. \tag{B.8}$$

Setting $\hbar = 1$ therefore relates our units of [M], [L], and [T]. Since [L] and [T] are equivalent
by our choice of $c = 1$, we can choose [M] as the single independent dimension for our
natural units:
$$[M] = [L]^{-1} = [T]^{-1}. \tag{B.9}$$


An example: the pion Compton wavelength. How do we convert from natural units to
more conventional units? Consider the pion Compton wavelength
$$\lambda_\pi = \hbar/M_\pi c \tag{B.10}$$
evaluated in both natural and conventional units. In natural units
$$\lambda_\pi = 1/M_\pi \tag{B.11}$$
where $M_\pi \approx 140\ \mathrm{MeV}/c^2$. In conventional units, using $M_\pi$, $\hbar$ (B.7), and $c$ (B.2), we have
the familiar result
$$\lambda_\pi = 1.41\ \mathrm{fm} \tag{B.12}$$
where the ‘fermi’ or femtometre, fm, is defined as
$$1\ \mathrm{fm} = 10^{-15}\ \mathrm{m}.$$
We therefore have the correspondence
$$\lambda_\pi = 1/M_\pi = 1.41\ \mathrm{fm}. \tag{B.13}$$


Practical cross section calculations: An easy-to-remember relation may be derived from
the result
$$\hbar c \approx 200\ \mathrm{MeV\,fm} \tag{B.14}$$
obtained directly from (B.2) and (B.7). Hence, in natural units, we have the relation
$$1\ \mathrm{fm} \approx \frac{1}{200\ \mathrm{MeV}} = 5\ (\mathrm{GeV})^{-1}. \tag{B.15}$$


Cross sections are calculated without $\hbar$’s and $c$’s and all masses, energies and momenta typically
in MeV or GeV. To convert the result to an area, we merely remember the dimensions
of a cross section:
$$[\sigma] = [L]^2 = [M]^{-2}. \tag{B.16}$$
If masses, momenta and energies have been specified in GeV, from (B.15) we derive the
useful result (from the more precise relation $\hbar c = 197.328$ MeV fm)
$$1\ (\mathrm{GeV})^{-2} = 0.3894\ \mathrm{mb} \tag{B.17}$$
where a millibarn, mb, is defined to be
$$1\ \mathrm{mb} = 10^{-31}\ \mathrm{m}^2.$$
Note that a ‘typical’ hadronic cross section corresponds to an area of about $\lambda_\pi^2$ where
$$\lambda_\pi^2 = 1/M_\pi^2 = 20\ \mathrm{mb}.$$
Electromagnetic cross sections are an order of magnitude smaller: specifically for lowest
order $e^+e^- \to \mu^+\mu^-$
$$\sigma \approx \frac{86.8}{s}\ \mathrm{nb} \tag{B.18}$$
where $s$ is in (GeV)$^2$ (see problem 8.18(d) in chapter 8).
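
The conversions in this appendix are easy to mechanize. The sketch below is illustrative only: the function names are hypothetical, the value of $\hbar c$ is the one quoted above, and the lowest-order formula $\sigma = 4\pi\alpha^2/3s$ for $e^+e^- \to \mu^+\mu^-$ with $\alpha \approx 1/137.036$ is assumed from the standard result rather than derived here.

```python
import math

HBAR_C_MEV_FM = 197.328  # hbar*c in MeV fm, as quoted in the text
MB_PER_FM2 = 10.0        # 1 fm^2 = 1e-30 m^2 = 10 mb, since 1 mb = 1e-31 m^2
NB_PER_MB = 1.0e6        # 1 mb = 1e6 nb


def compton_wavelength_fm(mass_mev):
    """Convert 1/M from natural units (MeV^-1) to fm using hbar*c."""
    return HBAR_C_MEV_FM / mass_mev


def gev_minus2_to_mb(value_gev_minus2):
    """Convert a cross section from GeV^-2 to millibarns."""
    hbar_c_gev_fm = HBAR_C_MEV_FM / 1000.0
    return value_gev_minus2 * hbar_c_gev_fm**2 * MB_PER_FM2


def sigma_mumu_nb(s_gev2, alpha=1.0 / 137.036):
    """Lowest-order e+ e- -> mu+ mu- cross section, 4*pi*alpha^2/(3s), in nanobarns."""
    sigma_natural = 4.0 * math.pi * alpha**2 / (3.0 * s_gev2)
    return gev_minus2_to_mb(sigma_natural) * NB_PER_MB


if __name__ == "__main__":
    print("pion Compton wavelength (fm):", compton_wavelength_fm(140.0))
    print("1 GeV^-2 in mb:", gev_minus2_to_mb(1.0))
    print("1/M_pi^2 in mb:", gev_minus2_to_mb(1.0 / 0.140**2))
    print("sigma(mu mu) at s = 1 GeV^2 (nb):", sigma_mumu_nb(1.0))
```

Run as a script, this should reproduce (B.12), (B.17), the hadronic estimate of about 20 mb, and the coefficient in (B.18) to the precision quoted.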


C 


Maxwell’s Equations: Choice of Units 


In high-energy physics, it is not the convention to use the rationalized MKS system of units 
when treating Maxwell’s equations. Since the discussion is always limited to field equations 
in vacuo, it is usually felt desirable to adopt a system of units in which these equations take 
their simplest possible form: in particular, one such that the constants $\epsilon_0$ and $\mu_0$, employed
in the MKS system, do not appear. These two constants enter, of course, via the force laws
of Coulomb and Ampère, respectively. These laws relate a mechanical quantity (force) to
electrical ones (charge and current). The introduction of $\epsilon_0$ in Coulomb’s law
$$\boldsymbol{F} = \frac{q_1q_2\,\boldsymbol{r}}{4\pi\epsilon_0 r^3} \tag{C.1}$$


enables one to choose arbitrarily one of the electrical units and assign to it a dimension 
independent of those entering into mechanics (mass, length and time). If, for example, we 
use the coulomb as the basic electrical quantity (as in the MKS system), $\epsilon_0$ has dimension
(coulomb)$^2$ [T]$^2$/[M][L]$^3$. Thus the common practical units (volt, ampère, coulomb, etc)
can be employed in applications to both fields and circuits. However, for our purposes 
this advantage is irrelevant, since we are only concerned with the field equations, not with 
practical circuits. In our case, we prefer to define the electrical units in terms of mechanical 
ones in such a way as to reduce the field equations to their simplest form. The field equation 
corresponding to (C.1) is 


$$\nabla\cdot\boldsymbol{E} = \rho/\epsilon_0 \quad \text{(Gauss’ law: MKS)} \tag{C.2}$$

and this may obviously be simplified if we choose the unit of charge such that $\epsilon_0$ becomes
unity. Such a system, in which CGS units are used for the mechanical quantities, is a variant
of the electrostatic part of the ‘Gaussian CGS’ system. The original Gaussian system set
$\epsilon_0 \to 1/4\pi$, thereby simplifying the force law (C.1), but introducing a compensating $4\pi$ into
the field equation (C.2). The field equation is, in fact, primary, and the $4\pi$ is a geometrical
factor appropriate only to the specific case of three dimensions, so that it should not appear
in a field equation of general validity. The system in which $\epsilon_0$ in (C.2) may be replaced by
unity is called the ‘rationalized Gaussian CGS’ or ‘Heaviside-Lorentz’ system:

$$\nabla\cdot\boldsymbol{E} = \rho \quad \text{(Gauss’ law; Heaviside-Lorentz)}. \tag{C.3}$$


Generally, systems in which the $4\pi$ factors appear in the force equations rather than the
field equations are called ‘rationalized’. 

Of course, (C.3) is only the first of the Maxwell equations in Heaviside-Lorentz units.
In the Gaussian system, $\mu_0$ in Ampère’s force law
$$\boldsymbol{F} = \frac{\mu_0}{4\pi}\iint \frac{\boldsymbol{j}_1 \times (\boldsymbol{j}_2 \times \boldsymbol{r}_{12})}{r_{12}^3}\,d^3r_1\,d^3r_2 \tag{C.4}$$
was set equal to $4\pi$, thereby defining a unit of current (the electromagnetic unit or Biot
(Bi emu)). The unit of charge (the electrostatic unit or Franklin (Fr esu)) has already been
defined by the (Gaussian) choice $\epsilon_0 \to 1/4\pi$ and currents via $\mu_0 \to 4\pi$, and $c$ appears
explicitly in the equations. In the rationalized (Heaviside-Lorentz) form of this system,
$\epsilon_0 \to 1$ and $\mu_0 \to 1$, and the remaining Maxwell equations are
$$\nabla\times\boldsymbol{E} = -\frac{1}{c}\frac{\partial\boldsymbol{B}}{\partial t} \tag{C.5}$$
$$\nabla\cdot\boldsymbol{B} = 0 \tag{C.6}$$
$$\nabla\times\boldsymbol{B} = \boldsymbol{j} + \frac{1}{c}\frac{\partial\boldsymbol{E}}{\partial t}. \tag{C.7}$$


A further discussion of units in electromagnetic theory is given in Panofsky and Phillips 
(1962, appendix I). 

Finally, throughout this book we have used a particular choice of units for mass, length, 
and time such that $\hbar = c = 1$ (see appendix B). In that case, the Maxwell equations we use
are as in (C.3), (C.5)-(C.7), but with c replaced by unity. 

As an example of the relation between MKS and the system employed in this book (and 
universally in high-energy physics), we remark that the fine structure constant is written as 


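$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \quad \text{in MKS units} \tag{C.8}$$
or as
$$\alpha = \frac{e^2}{4\pi} \quad \text{in Heaviside-Lorentz units with } \hbar = c = 1. \tag{C.9}$$
Clearly the value of $\alpha$ ($\approx 1/137$) is the same in both cases, but the numerical values of ‘$e$’
in (C.8) and in (C.9) are, of course, different.

The choice of rationalized MKS units for Maxwell’s equations is a part of the SI system
of units. In this system of units, the numerical values of $\mu_0$ and $\epsilon_0$ are
$$\mu_0 = 4\pi \times 10^{-7}\ (\mathrm{kg\,m\,C^{-2}} = \mathrm{H\,m^{-1}})$$
and, since $\mu_0\epsilon_0 = 1/c^2$,
$$\epsilon_0 = \frac{10^7}{4\pi c^2} \approx \frac{1}{36\pi \times 10^9}\ (\mathrm{C^2\,s^2\,kg^{-1}\,m^{-3}} = \mathrm{F\,m^{-1}}).$$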
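
The statement that $\alpha$ has the same value in both systems while ‘$e$’ does not can be illustrated numerically. This is a minimal sketch under stated assumptions: the SI values of the electron charge, $\hbar$ and $c$ below are standard reference values that are not quoted in the text, $\mu_0$ is taken as $4\pi \times 10^{-7}$ as above, and the function names are hypothetical.

```python
import math

E_CHARGE_C = 1.602176634e-19   # electron charge in coulombs (assumed reference value)
HBAR_J_S = 1.054571817e-34     # hbar in joule seconds (assumed reference value)
C_M_S = 2.99792458e8           # speed of light in m/s (assumed reference value)
MU_0 = 4.0 * math.pi * 1.0e-7  # H/m, the value quoted in the text
EPS_0 = 1.0 / (MU_0 * C_M_S**2)  # from mu_0 * eps_0 = 1/c^2


def alpha_mks():
    """Fine structure constant from (C.8): e^2 / (4 pi eps_0 hbar c)."""
    return E_CHARGE_C**2 / (4.0 * math.pi * EPS_0 * HBAR_J_S * C_M_S)


def charge_heaviside_lorentz(alpha):
    """Dimensionless charge e from (C.9): alpha = e^2 / (4 pi) with hbar = c = 1."""
    return math.sqrt(4.0 * math.pi * alpha)


if __name__ == "__main__":
    alpha = alpha_mks()
    print("1/alpha        =", 1.0 / alpha)
    print("e (H-L units)  =", charge_heaviside_lorentz(alpha))
    print("e (MKS, in C)  =", E_CHARGE_C)
```

The same dimensionless $\alpha$ corresponds to a charge of about 0.3 in Heaviside-Lorentz natural units and to about $1.6 \times 10^{-19}$ C in MKS units.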


D 


Special Relativity: Invariance and Covariance 


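The co-ordinate 4-vector $x^\mu$ is defined by
$$x^\mu = (x^0, x^1, x^2, x^3)$$
where $x^0 = t$ (with $c = 1$) and $(x^1, x^2, x^3) = \boldsymbol{x}$. Under a Lorentz transformation along the
$x^1$-axis with velocity $v$, $x^\mu$ transforms to
$$\begin{aligned}
x^{0\prime} &= \gamma(x^0 - vx^1) \\
x^{1\prime} &= \gamma(-vx^0 + x^1) \\
x^{2\prime} &= x^2 \\
x^{3\prime} &= x^3
\end{aligned} \tag{D.1}$$
where $\gamma = (1 - v^2)^{-1/2}$.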

A general ‘contravariant 4-vector’ is defined to be any set of four quantities $A^\mu =
(A^0, A^1, A^2, A^3) = (A^0, \boldsymbol{A})$ which transform under Lorentz transformations exactly as the
corresponding components of the coordinate 4-vector $x^\mu$. Note that the definition is phrased
in terms of the transformation property (under Lorentz transformations) of the object being
defined. An important example is the energy-momentum 4-vector $p^\mu = (E, \boldsymbol{p})$, where for a
particle of rest mass $m$, $E = (\boldsymbol{p}^2 + m^2)^{1/2}$. Another example is the 4-gradient $\partial^\mu = (\partial^0, -\nabla)$
(see problem 2.1) where
$$\partial^0 = \frac{\partial}{\partial t}, \qquad \nabla = \left(\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{D.2}$$


Lorentz transformations leave the expression $(A^0)^2 - \boldsymbol{A}^2$ invariant for a general 4-vector $A^\mu$.
For example, $E^2 - \boldsymbol{p}^2 = m^2$ is invariant, implying that the rest mass $m$ is invariant under
Lorentz transformations. Another example is the four-dimensional invariant differential
operator analogous to $\nabla^2$, namely
$$\Box = (\partial^0)^2 - \nabla^2$$
which is precisely the operator appearing in the massless wave equation
$$\Box\phi = (\partial^0)^2\phi - \nabla^2\phi = 0.$$

The expression $(A^0)^2 - \boldsymbol{A}^2$ may be regarded as the scalar product of $A^\mu$ with a related
‘covariant vector’ $A_\mu = (A^0, -\boldsymbol{A})$. Then
$$(A^0)^2 - \boldsymbol{A}^2 = \sum_\mu A^\mu A_\mu$$
where, in practice, the summation sign on repeated ‘upstairs’ and ‘downstairs’ indices is
always omitted. We shall often shorten the expression ‘$A^\mu A_\mu$’ even further, to ‘$A^2$’; thus
$p^2 = E^2 - \boldsymbol{p}^2 = m^2$. The ‘downstairs’ version of $\partial^\mu$ is $\partial_\mu = (\partial^0, \nabla)$. Then $\partial_\mu\partial^\mu = \partial^2 = \Box$.

‘Lowering’ and ‘raising’ indices is effected by the metric tensor $g^{\mu\nu}$ or $g_{\mu\nu}$, where $g^{00} =
g_{00} = 1$, $g^{11} = g^{22} = g^{33} = g_{11} = g_{22} = g_{33} = -1$, all other components vanishing. Thus if
$A_\mu = g_{\mu\nu}A^\nu$ then $A_0 = A^0$, $A_1 = -A^1$, etc.


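In the same way, the scalar product $A \cdot B$ of two 4-vectors is
$$A \cdot B = A^\mu B_\mu = A^0B^0 - \boldsymbol{A}\cdot\boldsymbol{B} \tag{D.3}$$
and this is also invariant under Lorentz transformations. For example, the invariant four-dimensional
divergence of a 4-vector $j^\mu = (\rho, \boldsymbol{j})$ is
$$\partial^\mu j_\mu = \partial^0\rho - (-\nabla)\cdot\boldsymbol{j} = \partial^0\rho + \nabla\cdot\boldsymbol{j} = \partial_\mu j^\mu \tag{D.4}$$
since the spatial part of $\partial^\mu$ is $-\nabla$.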

Because the Lorentz transformation is linear, it immediately follows that the sum (or 
difference) of two 4-vectors is also a 4-vector. In a reaction of the type ‘$1 + 2 \to 3 + 4 +
\cdots N$’ we express the conservation of both energy and momentum as one ‘4-momentum
conservation equation’:
$$p_1 + p_2 = p_3 + p_4 + \cdots p_N. \tag{D.5}$$
In practice, the 4-vector index on all the $p$’s is conventionally omitted in conservation
equations such as (D.5), but it is nevertheless important to remember, in that case, that
it is actually four equations, one for the energy components and a further three for the
momentum components. Further, it follows that quantities such as $(p_1 + p_2)^2$, $(p_1 - p_3)^2$ are
invariant under Lorentz transformations.
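
The invariance statements above can be checked directly from (D.1) and (D.3). The following minimal sketch uses hypothetical helper names and arbitrary illustrative masses and momenta (in natural units); it boosts two 4-momenta along the $x^1$-axis and compares $p^2$ and $(p_1 + p_2)^2$ before and after.

```python
import math


def boost_x(vec, v):
    """Apply the transformation (D.1), a boost along x^1 with velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    a0, a1, a2, a3 = vec
    return (gamma * (a0 - v * a1), gamma * (-v * a0 + a1), a2, a3)


def dot(a, b):
    """Scalar product (D.3): A^0 B^0 minus the 3-vector dot product."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def add(a, b):
    """Component-wise sum of two 4-vectors."""
    return tuple(x + y for x, y in zip(a, b))


def four_momentum(m, px, py, pz):
    """On-shell 4-momentum (E, p) with E = sqrt(p^2 + m^2)."""
    return (math.sqrt(px * px + py * py + pz * pz + m * m), px, py, pz)


if __name__ == "__main__":
    p1 = four_momentum(0.14, 0.3, -0.2, 1.0)   # illustrative values only
    p2 = four_momentum(0.94, -0.1, 0.4, -0.5)
    v = 0.6
    print("p1^2 before/after:", dot(p1, p1), dot(boost_x(p1, v), boost_x(p1, v)))
    total = add(p1, p2)
    total_boosted = add(boost_x(p1, v), boost_x(p2, v))
    print("(p1+p2)^2 before/after:", dot(total, total), dot(total_boosted, total_boosted))
```

The individual components change under the boost, but the scalar products agree to rounding error, which is the content of the invariance claim.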

We may also consider products of the form $A^\mu B^\nu$, where $A$ and $B$ are 4-vectors. As
$\mu$ and $\nu$ each run over their four possible values (0,1,2,3), 16 different ‘components’ are
generated ($A^0B^0, A^0B^1, \ldots, A^3B^3$). Under a Lorentz transformation, the components of $A$
and $B$ will transform into definite linear combinations of themselves, as in the particular
case of (D.1). It follows that the 16 components of $A^\mu B^\nu$ will also transform into well-defined
linear combinations of themselves (try it for $A^0B^1$ and (D.1)). Thus we have constructed
a new object whose 16 components transform by a well-defined linear transformation law
under a Lorentz transformation, as did the components of a 4-vector. This new quantity,
defined by its transformation law, is called a tensor, or more precisely a ‘contravariant
second-rank tensor’, the ‘contravariant’ referring to the fact that both indices are upstairs,
the ‘second rank’ meaning that it has two indices. An important example of such a tensor
is provided by $\partial^\mu A^\nu(x) - \partial^\nu A^\mu(x)$, which is the electromagnetic field strength tensor $F^{\mu\nu}$,
introduced in chapter 2. More generally we can consider tensors $B^{\mu\nu}$ which are not literally
formed by ‘multiplying’ two vectors together, but which transform in just the same way;
and we can introduce third- and higher-rank tensors similarly, which can also be ‘mixed’,
with some upstairs and some downstairs indices.

We now state a very useful and important fact. Suppose we ‘dot’ a downstairs 4-vector
$A_\mu$ into a contravariant second-rank tensor $B^{\mu\nu}$, via the operation $A_\mu B^{\mu\nu}$, where as always
a sum on the repeated index $\mu$ is understood. Then this quantity transforms as a 4-vector,
via its ‘loose’ index $\nu$. This is obvious if $B^{\mu\nu}$ is actually a product such as $B^{\mu\nu} = C^\mu D^\nu$,
since then we have $A_\mu B^{\mu\nu} = (A \cdot C)D^\nu$, and $(A \cdot C)$ is an invariant, which leaves the 4-vector
$D^\nu$ as the only ‘transforming’ object left. But even if $B^{\mu\nu}$ is not such a product, it
transforms under Lorentz transformations in exactly the same way as if it were, and this
leads to the same result. An example is provided by the quantity $\partial_\mu F^{\mu\nu}$ which enters on
the left-hand side of the Maxwell equations in the form (2.18).

This example brings us conveniently to the remaining concept we need to introduce 
here, which is the important one of covariance. Referring to (2.18), we note that it has the 
form of an equality between two quantities ($\partial_\mu F^{\mu\nu}$ on the left, $j^\nu_{\rm em}$ on the right) each of
which transforms in the same way under Lorentz transformations, namely as a contravariant
4-vector. One says that (2.18) is ‘Lorentz covariant’, the word ‘covariant’ here meaning
precisely that both sides transform in the same way (i.e. consistently) under Lorentz trans- 
formations. Confusingly enough, this use of the word ‘covariant’ is evidently quite different 
from the one encountered previously in an expression such as ‘a covariant 4-vector’, where 
it just meant a 4-vector with a downstairs index. This new meaning of ‘covariant’ is actually 
much better captured by an alternative name for the same thing, which is ‘form invariant’, 
as we will shortly see. 

Why is this idea so important? Consider the (special) relativity principle, which states 
that the laws of physics should be the same in all inertial frames. The way in which this 
physical requirement is implemented mathematically is precisely via the notion of covariance 
under Lorentz transformations. For, consider how a law will typically be expressed. Relative 
to one inertial frame, we set up a coordinate system and describe the phenomena in question 
in terms of suitable coordinates, and such other quantities (forces, fields, etc) as may be 
necessary. We write the relevant law mathematically as equations relating these quantities, 
all referred to our chosen frame and coordinate system. What the relativity principle requires 
is that these relationships—these equations—must have the same form when the quantities 
in them are referred to a different inertial frame. Note that we must say ‘have the same 
form’, rather than ‘be identical to’, since we know very well that coordinates, at least, are 
not identical in two different inertial frames (cf (D.1)). This is why the term ‘form invariant’ 
is a more helpful one than ‘covariant’ in this context, but the latter is more commonly used. 

A more elementary example may be helpful. Consider Newton’s law in the simple form
$\boldsymbol{F} = m\ddot{\boldsymbol{r}}$. This equation is ‘covariant under rotations’, meaning that it preserves the same
form under a rotation of the coordinate system, and this in turn means that the physics
it expresses is independent of the orientation of our coordinate axes. The ‘same form’ in
this case is of course just $\boldsymbol{F}' = m\ddot{\boldsymbol{r}}'$. We emphasize again that the components of $\boldsymbol{F}'$ are
not the same as those of $\boldsymbol{F}$, nor are the components of $\ddot{\boldsymbol{r}}'$ the same as those of $\ddot{\boldsymbol{r}}$; but the
relationship between $\boldsymbol{F}'$ and $\ddot{\boldsymbol{r}}'$ is exactly the same as the relationship between $\boldsymbol{F}$ and $\ddot{\boldsymbol{r}}$,
and that is what is required.

It is important to understand why this deceptively simple result (‘$\boldsymbol{F}' = m\ddot{\boldsymbol{r}}'$’) has been
obtained. The reason is that we have assumed (or asserted) that ‘force’ is in fact to be
represented mathematically as a 3-vector quantity. Once we have said that, the rest follows.
More formally, the transformation law of the components of $\boldsymbol{r}$ is $r'_i = R_{ij}r_j$ (sum on $j$
understood), where the matrix of transformation coefficients $R$ is ‘orthogonal’ ($RR^{\rm T} =
R^{\rm T}R = 1$), which ensures that the length (squared) of $\boldsymbol{r}$ is invariant, $\boldsymbol{r}^2 = \boldsymbol{r}'^2$. To say
that ‘force is a 3-vector’ then implies that the components of $\boldsymbol{F}$ transform by the same set
of coefficients $R_{ij}$: $F'_i = R_{ij}F_j$. Thus starting from the law $F_j = m\ddot{r}_j$ which relates the
components in one frame, by multiplying both sides of the equation by $R_{ij}$ and summing
over $j$ we arrive at $F'_i = m\ddot{r}'_i$, which states precisely that the components in the primed
frame bear the same relationship to each other as the components in the unprimed frame
did. This is the property of covariance under rotations, and it ensures that the physics
embodied in the law is the same for all systems which differ from one another only by a
rotation.
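
The index manipulation just described can be made concrete. This is a minimal sketch, assuming a rotation about the $z$-axis as the example of an orthogonal matrix $R$; the helper names and the numerical values of the mass and acceleration are illustrative assumptions.

```python
import math


def rotation_z(theta):
    """Matrix R_ij for a rotation of the coordinate axes about z by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(R, v):
    """Transform components: v'_i = R_ij v_j (sum on j)."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def is_orthogonal(R, tol=1e-12):
    """Check R R^T = 1 element by element."""
    for i in range(3):
        for k in range(3):
            entry = sum(R[i][j] * R[k][j] for j in range(3))
            if abs(entry - (1.0 if i == k else 0.0)) > tol:
                return False
    return True


if __name__ == "__main__":
    m = 2.0
    accel = [0.5, -1.0, 3.0]          # components of r-double-dot in the unprimed frame
    force = [m * a for a in accel]    # F_j = m * rddot_j in the unprimed frame
    R = rotation_z(0.7)
    force_p, accel_p = apply(R, force), apply(R, accel)
    print("R orthogonal:", is_orthogonal(R))
    print("F'      :", force_p)
    print("m * r'' :", [m * a for a in accel_p])
```

The primed components differ from the unprimed ones, yet they satisfy the same relation $F'_i = m\ddot{r}'_i$, which is what covariance under rotations means.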

In just the same way, if we can write equations of physics as equalities between quantities 
which transform in the same way (i.e. ‘are covariant’) under Lorentz transformations, we 
will guarantee that these laws obey the relativity principle. This is indeed the case in the 
Lorentz covariant formulation of Maxwell’s equations, given in (2.18), which we now repeat 
here: $\partial_\mu F^{\mu\nu} = j^\nu_{\rm em}$. To check covariance, we follow essentially the same steps as in the
case of Newton’s equations, except that the transformations being considered are Lorentz
transformations. Inserting the expression (2.19) for $F^{\mu\nu}$, the equation can be written as
$(\partial_\mu\partial^\mu)A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}$. The bracketed quantities are actually invariants, as mentioned
earlier. This means that $\partial_\mu\partial^\mu$ is equal to $\partial'_\mu\partial'^\mu$, and similarly $\partial_\mu A^\mu = \partial'_\mu A'^\mu$, so that we
can write the equation as $(\partial'_\mu\partial'^\mu)A^\nu - \partial^\nu(\partial'_\mu A'^\mu) = j^\nu_{\rm em}$. It is now clear that if we apply
a Lorentz transformation to both sides, $A^\nu$ and $\partial^\nu$ will become $A'^\nu$ and $\partial'^\nu$, respectively,
while $j^\nu_{\rm em}$ will become $j'^\nu_{\rm em}$, since all these quantities are 4-vectors, transforming the same
way (as the 3-vectors did in the Newton case). Thus we obtain just the same form of
equation, written in terms of the ‘primed frame’ quantities, and this is the essence of (Lorentz 
transformation) covariance. 

Actually, the detailed ‘check’ that we have just performed is really unnecessary. All 
that is required for covariance is that (once again!) both sides of equations transform the 
same way. That this is true of (2.18) can be seen ‘by inspection’, once we understand 
the significance (for instance) of the fact that the $\mu$ indices are ‘dotted’ so as to form an
invariant. This example should convince the reader of the power of the 4-vector notation 
for this purpose: compare the ‘by inspection’ covariance of (2.18) with the job of verifying 
Lorentz covariance starting from the original Maxwell equations (2.1), (2.2), (2.3), and (2.8)! 
The latter involves establishing the rather complicated transformation law for the fields E 
and $\boldsymbol{B}$ (which, of course, form parts of the tensor $F^{\mu\nu}$). One can indeed show in this way
that the Maxwell equations are covariant under Lorentz transformations, but they are not 
manifestly (i.e. without doing any work) so, whereas in the form (2.18) they are. 


E 


Dirac $\delta$-Function


Consider approximating an integral by a sum over strips $\Delta x$ wide as shown in figure E.1:
$$\int_{x_1}^{x_2} f(x)\,dx \approx \sum_i f(x_i)\Delta x. \tag{E.1}$$

Consider the function $\delta(x - x_j)$ shown in figure E.2,
$$\delta(x - x_j) = \begin{cases} 1/\Delta x & \text{in the } j\text{th interval} \\ 0 & \text{all others.} \end{cases} \tag{E.2}$$
Clearly this function has the properties
$$\sum_i f(x_i)\delta(x_i - x_j)\Delta x = f(x_j) \tag{E.3}$$
and
$$\sum_i \delta(x_i - x_j)\Delta x = 1. \tag{E.4}$$

In the limit as we pass to an integral form, we might expect (applying (E.1) to the left-hand
sides) these equations to reduce to
$$\int_{x_1}^{x_2} f(x)\delta(x - x_j)\,dx = f(x_j) \tag{E.5}$$
and
$$\int_{x_1}^{x_2} \delta(x - x_j)\,dx = 1 \tag{E.6}$$
provided that $x_1 < x_j < x_2$. Clearly such ‘$\delta$-functions’ can easily be generalized to more
dimensions, e.g. three dimensions:
$$dV = dx\,dy\,dz \equiv d^3r, \qquad \delta(\boldsymbol{r} - \boldsymbol{r}_j) \equiv \delta(x - x_j)\delta(y - y_j)\delta(z - z_j). \tag{E.7}$$


Informally, therefore, we can think of the $\delta$-function as a function that is zero everywhere
except where its argument vanishes, at which point it is infinite in such a way that its
integral has unit area, and equations (E.5) and (E.6) hold. Do such amazing functions exist?
In fact, the informal idea just given does not define a respectable mathematical function.
More properly the use of the ‘$\delta$-function’ can be justified by introducing the notion of
‘distributions’ or ‘generalized functions’. Roughly speaking, this means we can think of the
‘$\delta$-function’ as the limit of a sequence of functions, whose properties converge to those given
here. The following useful expressions all approximate the $\delta$-function in this sense:


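$$\delta(x) = \lim_{\epsilon \to 0} \begin{cases} 1/\epsilon & \text{for } -\epsilon/2 \le x \le \epsilon/2 \\ 0 & \text{for } |x| > \epsilon/2 \end{cases} \tag{E.8}$$
$$\delta(x) = \lim_{\epsilon \to 0} \frac{1}{\pi}\frac{\epsilon}{x^2 + \epsilon^2} \tag{E.9}$$
$$\delta(x) = \lim_{N \to \infty} \frac{1}{\pi}\frac{\sin(Nx)}{x}. \tag{E.10}$$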
FIGURE E.1
Approximate evaluation of integral.


FIGURE E.2
The function $\delta(x - x_j)$.


The first of these is essentially the same as (E.2), and the second is a ‘smoother’ version 
of the first. The third is sketched in figure E.3; as N tends to infinity, the peak becomes 
infinitely high and narrow, but it still preserves unit area. 

Usually, under integral signs, $\delta$-functions can be manipulated with no danger of obtaining
a mathematically incorrect result. However, care must be taken when products of two
such generalized functions are encountered. 
FIGURE E.3
The function (E.10) for finite N. 


Résumé of Fourier series and Fourier transforms


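Fourier’s theorem asserts that any suitably well-behaved periodic function with period L
can be expanded as follows:

$$f(x) = \sum_{n=-\infty}^{\infty} a_n\, e^{2\pi i n x/L}. \qquad \text{(E.11)}$$

Using the orthonormality relation

$$\frac{1}{L}\int_{-L/2}^{L/2} e^{-2\pi i m x/L}\, e^{2\pi i n x/L}\, dx = \delta_{mn} \qquad \text{(E.12)}$$

with the Kronecker δ-symbol defined by

$$\delta_{mn} = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \ne n \end{cases} \qquad \text{(E.13)}$$

the coefficients in the expansion may be determined:

$$a_m = \frac{1}{L}\int_{-L/2}^{L/2} f(x)\, e^{-2\pi i m x/L}\, dx. \qquad \text{(E.14)}$$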
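Consider the limit of these expressions as $L \to \infty$. We may write

$$f(x) = \sum_{n=-\infty}^{\infty} F_n\, \Delta n \qquad \text{(E.15)}$$

with

$$F_n = a_n\, e^{2\pi i n x/L} \qquad \text{(E.16)}$$

and the interval $\Delta n = 1$. Defining

$$2\pi n/L = k \qquad \text{(E.17)}$$

and

$$L a_n = g(k) \qquad \text{(E.18)}$$

we can take the limit $L \to \infty$ to obtain

$$f(x) = \int_{-\infty}^{\infty} F_n\, dn = \int_{-\infty}^{\infty} \frac{g(k)\, e^{ikx}}{L}\, \frac{L\, dk}{2\pi}. \qquad \text{(E.19)}$$

Thus

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(k)\, e^{ikx}\, dk \qquad \text{(E.20)}$$

and similarly from (E.14)

$$g(k) = \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx. \qquad \text{(E.21)}$$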


These are the Fourier transform relations, and they lead us to an important representation
of the Dirac δ-function.
Substitute g(k) from (E.21) into (E.20) to obtain

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{ikx} \int_{-\infty}^{\infty} dx'\, e^{-ikx'} f(x'). \qquad \text{(E.22)}$$

Reordering the integrals, we arrive at the result

$$f(x) = \int_{-\infty}^{\infty} dx'\, f(x') \left\{ \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x - x')}\, dk \right\} \qquad \text{(E.23)}$$

valid for any function f(x). Thus the expression

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x - x')}\, dk \qquad \text{(E.24)}$$

has the remarkable property of vanishing everywhere except at $x = x'$, and its integral with
respect to $x'$ over any interval including x is unity (set f = 1 in (E.23)). In other words,
(E.24) provides us with a new representation of the Dirac δ-function:

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\, dk. \qquad \text{(E.25)}$$


Equation (E.25) is very important. It is the representation of the δ-function which is
most commonly used, and it occurs throughout this book. Note that if we replace the upper
and lower limits of integration in (E.25) by N and −N, and consider the limit $N \to \infty$, we
obtain exactly (E.10).

The integral in (E.25) represents the superposition, with identical uniform weight $(2\pi)^{-1}$,
of plane waves of all wavenumbers. Physically it may be thought of (cf (E.20)) as the
Fourier transform of unity. Equation (E.25) asserts that the contributions from all these
waves cancel completely, unless the phase parameter x is zero, in which case the integral
manifestly diverges and ‘δ(0) is infinity’ as expected. The fact that the Fourier transform
of a constant is a δ-function is an extreme case of the bandwidth theorem from Fourier
transform theory, which states that if the (suitably defined) ‘spread’ in a function g(k) is
Δk, and that of its transform f(x) is Δx, then $\Delta x\, \Delta k \ge \tfrac{1}{2}$. In the present case Δk is tending
to infinity and Δx to zero.
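The statement that truncating (E.25) at ±N reproduces (E.10) can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original argument: it integrates the kernel sin(Nx)/(πx) against the Gaussian test function exp(−x²), a choice made here for convenience. For that particular test function the result can be evaluated in closed form as erf(N/2), which tends to f(0) = 1 as N grows.

```python
import math

def truncated_kernel(x, N):
    """(1/2pi) * integral_{-N}^{N} e^{ikx} dk = sin(Nx) / (pi x), cf. (E.10)."""
    if x == 0.0:
        return N / math.pi
    return math.sin(N * x) / (math.pi * x)

def smeared_value(N, half_width=8.0, n=160_000):
    """Midpoint-rule estimate of the integral of the kernel against exp(-x^2)."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += truncated_kernel(x, N) * math.exp(-x * x)
    return total * h

# The numerical value should track erf(N/2), which approaches f(0) = 1.
for N in (1, 2, 5, 10):
    print(N, smeared_value(N), math.erf(N / 2))
```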

One very common use of (E.25) refers to the normalization of plane-wave states. If we
rewrite it in the form

$$\delta(k' - k) = \int_{-\infty}^{\infty} \frac{e^{-ik'x}}{(2\pi)^{1/2}}\, \frac{e^{ikx}}{(2\pi)^{1/2}}\, dx \qquad \text{(E.26)}$$

we can interpret it to mean that the wavefunctions $e^{ikx}/(2\pi)^{1/2}$ and $e^{ik'x}/(2\pi)^{1/2}$ are
orthogonal on the real axis $-\infty < x < \infty$ for $k \ne k'$ (since the left-hand side is zero), while
for $k = k'$ their overlap is infinite, in such a way that the integral of this overlap is unity.
This is the continuum analogue of orthonormality for wavefunctions labelled by a discrete
index, as in (E.12). We say that the plane waves in (E.26) are ‘normalized to a δ-function’.
There is, however, a problem with this as plane waves are not square integrable and thus 
do not strictly belong to a Hilbert space. Mathematical physicists concerned with such 
matters have managed to deal with this by introducing ‘rigged’ Hilbert spaces in which 
such a normalization is legitimate. Although we often, in the text, appear to be using ‘box 
normalization’ (i.e. restricting space to a finite volume V), in practice when we evaluate 
integrals over plane waves, the limits will be extended to infinity, and results like (E.26) 
will be used repeatedly. 
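Important three- and four-dimensional generalizations of (E.25) are:

$$\int e^{i\mathbf{k}\cdot\mathbf{x}}\, d^3k = (2\pi)^3\, \delta^3(\mathbf{x}) \qquad \text{(E.27)}$$

and

$$\int e^{ik\cdot x}\, d^4k = (2\pi)^4\, \delta^4(x) \qquad \text{(E.28)}$$

where $k\cdot x = k^0x^0 - \mathbf{k}\cdot\mathbf{x}$ (see appendix D), $\delta^4(x) = \delta(x^0)\,\delta^3(\mathbf{x})$, and
$\delta^3(\mathbf{x}) = \delta(x^1)\,\delta(x^2)\,\delta(x^3)$.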


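Properties of the δ-function


The basic properties of the δ-function are exemplified by the equations (see (E.5) and (E.6))

$$\int_{-\infty}^{\infty} \delta(x - a)\, dx = 1, \qquad \delta(x - a) = 0 \ \text{ for } x \ne a, \qquad \text{(E.29)}$$

where a is any real number; and

$$\int_{-\infty}^{\infty} \delta(x - a)\, f(x)\, dx = f(a) \qquad \text{(E.30)}$$

where f(x) is any continuous function of x. Other useful properties follow:

(i)
$$\delta(ax) = \frac{1}{|a|}\, \delta(x). \qquad \text{(E.31)}$$

Proof

For a > 0,

$$\int_{-\infty}^{\infty} \delta(ax)\, dx = \int_{-\infty}^{\infty} \delta(y)\, \frac{dy}{a} = \frac{1}{a}; \qquad \text{(E.32)}$$

for a < 0,

$$\int_{-\infty}^{\infty} \delta(ax)\, dx = \int_{\infty}^{-\infty} \delta(y)\, \frac{dy}{a} = \int_{-\infty}^{\infty} \delta(y)\, \frac{dy}{|a|} = \frac{1}{|a|}. \qquad \text{(E.33)}$$

(ii)
$$\delta(x) = \delta(-x) \quad \text{i.e. an even function.} \qquad \text{(E.34)}$$

Proof

$$f(0) = \int_{-\infty}^{\infty} \delta(x)\, f(x)\, dx. \qquad \text{(E.35)}$$

If f(x) is an odd function, f(0) = 0. Thus δ(x) must be an even function.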


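(iii)
$$\delta(f(x)) = \sum_i \frac{1}{|df/dx|_{x = a_i}}\, \delta(x - a_i) \qquad \text{(E.36)}$$

where $a_i$ are the roots of f(x) = 0.

Proof

The δ-function is only non-zero when its argument vanishes. Thus we are concerned with
the roots of f(x) = 0. In the vicinity of a root

$$f(a_i) = 0 \qquad \text{(E.37)}$$

we can make a Taylor expansion

$$f(x) = (x - a_i)\left(\frac{df}{dx}\right)_{x = a_i} + \cdots. \qquad \text{(E.38)}$$

Thus the δ-function has non-zero contributions from each of the roots $a_i$ of the form

$$\delta(f(x)) = \sum_i \delta\left[(x - a_i)\left(\frac{df}{dx}\right)_{x = a_i}\right]. \qquad \text{(E.39)}$$

Hence (using property (i)) we have

$$\delta(f(x)) = \sum_i \frac{1}{|df/dx|_{x = a_i}}\, \delta(x - a_i). \qquad \text{(E.40)}$$

Consider the example

$$\delta(x^2 - a^2). \qquad \text{(E.41)}$$

Thus

$$f(x) = x^2 - a^2 = (x - a)(x + a) \qquad \text{(E.42)}$$

with two roots $x = \pm a$ (a > 0), and df/dx = 2x. Hence

$$\delta(x^2 - a^2) = \frac{1}{2a}\left\{\delta(x - a) + \delta(x + a)\right\}. \qquad \text{(E.43)}$$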
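(iv)
$$x\,\delta(x) = 0. \qquad \text{(E.44)}$$

This is to be understood as always occurring under an integral. It is obvious from the
definition or from property (ii).

(v)
$$\int_{-\infty}^{\infty} f(x)\, \delta'(x)\, dx = -f'(0) \qquad \text{(E.45)}$$

where

$$\delta'(x) = \frac{d}{dx}\delta(x). \qquad \text{(E.46)}$$

Proof

$$\int_{-\infty}^{\infty} f(x)\, \delta'(x)\, dx = -\int_{-\infty}^{\infty} f'(x)\, \delta(x)\, dx + \Big[f(x)\,\delta(x)\Big]_{-\infty}^{\infty} = -f'(0) \qquad \text{(E.47)}$$

since the second term vanishes.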
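(vi)
$$\int_{-\infty}^{x} \delta(x' - a)\, dx' = \theta(x - a) \qquad \text{(E.48)}$$

where

$$\theta(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x > 0 \end{cases} \qquad \text{(E.49)}$$

is the so-called θ-function.

Proof

For x > a,

$$\int_{-\infty}^{x} \delta(x' - a)\, dx' = 1; \qquad \text{(E.50)}$$

for x < a,

$$\int_{-\infty}^{x} \delta(x' - a)\, dx' = 0. \qquad \text{(E.51)}$$

By a simple extension it is easy to prove the result

$$\int_{x_1}^{x_2} \delta(x - a)\, dx = \theta(x_2 - a) - \theta(x_1 - a). \qquad \text{(E.52)}$$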
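(vii)
$$\delta(x - y)\, \delta(x - z) = \delta(x - y)\, \delta(y - z). \qquad \text{(E.53)}$$

Proof

Take any continuous function of z, f(z). Then

$$\int_{-\infty}^{\infty} f(z)\, dz\, \{\delta(x - y)\, \delta(x - z)\} = f(x)\, \delta(x - y) \qquad \text{(E.54)}$$

$$= f(y)\, \delta(x - y) = \int_{-\infty}^{\infty} f(z)\, dz\, \{\delta(x - y)\, \delta(y - z)\}. \qquad \text{(E.55)}$$

Thus the two sides of (vii) are equivalent as factors in an integrand with z as the integration
variable.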
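Property (iii), and in particular the example (E.43), can be checked numerically by standing in a narrow normalized Gaussian for the δ-function. This is a rough sketch under that assumption: the width `eps`, the integration window and the test function are illustrative choices, and the agreement is only as good as the Gaussian is narrow.

```python
import math

def gaussian_delta(u, eps):
    """Narrow normalized Gaussian standing in for delta(u)."""
    return math.exp(-u * u / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def lhs(g, a, eps=2e-3, lo=-5.0, hi=5.0, n=400_000):
    """Midpoint estimate of the integral of delta(x^2 - a^2) g(x) dx,
    with the delta-function replaced by gaussian_delta."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += gaussian_delta(x * x - a * a, eps) * g(x)
    return total * h

def rhs(g, a):
    """Right-hand side implied by (E.43): [g(a) + g(-a)] / (2a)."""
    return (g(a) + g(-a)) / (2 * a)

a = 1.5
print(lhs(math.exp, a), rhs(math.exp, a))   # the two values should nearly agree
```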
Exercise


Use property (iii) plus the definition of the θ-function to perform the $p^0$ integration and
prove the useful phase space formula

$$\int d^4p\; \delta(p^2 - m^2)\, \theta(p^0) = \int d^3p/2E \qquad \text{(E.56)}$$

where

$$p^2 = (p^0)^2 - \mathbf{p}^2 \qquad \text{(E.57)}$$

and

$$E = +(\mathbf{p}^2 + m^2)^{1/2}. \qquad \text{(E.58)}$$

The relation (E.56) shows that the expression $d^3p/2E$ is Lorentz invariant: on the left-hand
side, $d^4p$ and $\delta(p^2 - m^2)$ are invariant, while $\theta(p^0)$ depends only on the sign of $p^0$, which
cannot be changed by a ‘proper’ Lorentz transformation, that is, one that does not reverse
the sense of time.


F 


Contour Integration 


We begin by recalling some relevant results from the calculus of real functions of two real 
variables x and y, which we shall phrase in ‘physical’ terms. Consider a particle moving in 
the xy-plane subject to a force F = (P(x,y), Q(x, y)) whose x- and y-components P and 
Q vary throughout the plane. Suppose the particle moves, under the action of the force, 
around a closed path C in the xy-plane. Then the total work done by the force on the
particle, $W_C$, will be given by the integral

$$W_C = \oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C P\, dx + Q\, dy \qquad \text{(F.1)}$$

where the $\oint$ sign means that the integration path is closed. Using Stokes’ theorem, we can
rewrite (F.1) as a surface integral

$$W_C = \iint_S \operatorname{curl}\mathbf{F}\cdot d\mathbf{S} \qquad \text{(F.2)}$$

where S is any surface bounded by C (as a butterfly net is bounded by the rim). Taking S
to be the area in the xy-plane enclosed by C, we have $d\mathbf{S} = dx\, dy\, \hat{\mathbf{k}}$ and

$$W_C = \iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy. \qquad \text{(F.3)}$$


A mathematically special, but physically common, case is that in which F is a ‘conservative
force’, derivable from a potential function V(x, y) (in this two-dimensional example) such
that

$$P(x, y) = -\frac{\partial V}{\partial x} \quad \text{and} \quad Q(x, y) = -\frac{\partial V}{\partial y} \qquad \text{(F.4)}$$

the minus signs being the usual convention. In that case, it is clear that

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \qquad \text{(F.5)}$$

and hence $W_C$ in (F.3) is zero. The condition (F.5) is, in fact, both necessary and sufficient
for $W_C = 0$.
There can, however, be surprises. Consider, for example, the potential

$$V(x, y) = -\tan^{-1}(y/x). \qquad \text{(F.6)}$$

In this case, the components of the associated force are

$$P = -\frac{\partial V}{\partial x} = \frac{-y}{x^2 + y^2} \quad \text{and} \quad Q = -\frac{\partial V}{\partial y} = \frac{x}{x^2 + y^2}. \qquad \text{(F.7)}$$
Let us calculate the work done by this force in the case that C is the circle of unit radius
centred on the origin, traversed in the anticlockwise sense. We may parametrize a point on
this circle by $(x = \cos\theta,\ y = \sin\theta)$, so that (F.1) becomes

$$W_C = \oint_C -\sin\theta\,(-\sin\theta\, d\theta) + \cos\theta\,(\cos\theta\, d\theta) = \oint_C d\theta = 2\pi \qquad \text{(F.8)}$$

a result which is plainly different from zero. The reason is that although this force is (minus)
the gradient of a potential, the latter is not single-valued, in the sense that it does not return
to its original value after a circuit round the origin. Indeed, the V of (F.6) is just −θ, which
changes by −2π on such a circuit, exactly as calculated in (F.8) allowing for the minus signs
in (F.4). Alternatively, we may suspect that the trouble has to do with the ‘blow up’ of the
integrand of (F.7) at the point x = y = 0, which is also true.
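The role of the origin can be made concrete with a small numerical experiment. The sketch below, which is illustrative only, evaluates the line integral (F.1) for the force (F.7) around circles of arbitrary centre and radius; the function names and the discretization are assumptions made here. Circles that enclose the origin should give 2π, and circles that do not should give zero.

```python
import math

def force(x, y):
    """Components (P, Q) of the force (F.7)."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def work_around_circle(cx, cy, radius, n=20_000):
    """Midpoint-rule estimate of the closed line integral of P dx + Q dy,
    taken anticlockwise around the circle centred on (cx, cy)."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        P, Q = force(x, y)
        total += P * dx + Q * dy
    return total

print(work_around_circle(0.0, 0.0, 1.0))   # unit circle of (F.8)
print(work_around_circle(0.5, 0.0, 1.0))   # off-centre, still encloses the origin
print(work_around_circle(2.0, 0.0, 1.0))   # does not enclose the origin
```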

Much of the foregoing has direct parallels within the theory of functions of a complex
variable z = x + iy, to which we now give a brief and informal introduction, limiting ourselves
to the minimum required in the text¹. The crucial property, to which all the results we need
are related, is analyticity. A function f(z) is analytic in a region R of the complex plane if
it has a unique derivative at every point of R. The derivative at a point z is defined by the
natural generalization of the real variable definition:

$$\frac{df}{dz} = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}. \qquad \text{(F.9)}$$

The crucial new feature in the complex case, however, is that ‘Δz’ is actually an (infinitesimal)
vector, in the xy (Argand) plane. Thus we may immediately ask: along which of
the infinitely many possible directions of Δz are we supposed to approach the point z in
(F.9)? The answer is: along any! This is the force of the word ‘unique’ in the definition of
analyticity, and it is a very powerful requirement.

Let f(z) be an analytic function of z in some region R, and let u and v be the real 
and imaginary parts of f: f = u + iv, where u and v are each functions of x and y. Let us 
evaluate df/dz at the point z = x + iy in two different ways, which must be equivalent. 

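(a) By considering $\Delta z = \Delta x$ (i.e. $\Delta y = 0$). In this case

$$\frac{df}{dz} = \lim_{\Delta x \to 0} \frac{u(x + \Delta x, y) - u(x, y) + iv(x + \Delta x, y) - iv(x, y)}{\Delta x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \qquad \text{(F.10)}$$

from the definition of a partial derivative.

(b) By considering $\Delta z = i\Delta y$ (i.e. $\Delta x = 0$). In this case

$$\frac{df}{dz} = \lim_{\Delta y \to 0} \frac{u(x, y + \Delta y) - u(x, y) + iv(x, y + \Delta y) - iv(x, y)}{i\Delta y} = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y}. \qquad \text{(F.11)}$$

Equating (F.10) and (F.11) we obtain the Cauchy-Riemann (CR) relations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \qquad \text{(F.12)}$$

which are the necessary and sufficient conditions for f to be analytic.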
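The CR relations lend themselves to a quick numerical test. The sketch below estimates the four partial derivatives by central differences and returns the two combinations that (F.12) requires to vanish. It is an illustration only: the step size and the sample functions (z², which is analytic, and the complex conjugate, which is not) are choices made here, not taken from the text.

```python
def partials(f, x, y, h=1e-5):
    """Central-difference estimates of (du/dx, du/dy, dv/dx, dv/dy)
    for a complex function f = u + iv."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cr_residuals(f, x, y):
    """Return (du/dx - dv/dy, du/dy + dv/dx); both vanish when (F.12) holds."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, uy + vx

print(cr_residuals(lambda z: z * z, 0.3, -1.2))          # analytic: both close to zero
print(cr_residuals(lambda z: z.conjugate(), 0.3, -1.2))  # not analytic: first entry close to 2
```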


¹ For a fuller introduction, see for example Boas (1983, chapter 14).
Consider now an integral of the form

$$I = \oint_C f(z)\, dz \qquad \text{(F.13)}$$

where again the symbol $\oint$ means that the integration path (or contour) in the complex
plane is closed. Inserting f = u + iv and z = x + iy, we may write (F.13) as

$$I = \oint_C (u\, dx - v\, dy) + i\oint_C (v\, dx + u\, dy). \qquad \text{(F.14)}$$

Thus the single complex integral (F.13) is equivalent to the two real-plane integrals (F.14);
one is the real part of I, the other is the imaginary part, and each is of the form (F.1). In
the first, we have P = u, Q = −v. Hence the condition (F.5) for the integral to vanish is
$\partial u/\partial y = -\partial v/\partial x$, which is precisely the second CR relation! Similarly, in the second integral
in (F.14) we have P = v and Q = u so that condition (F.5) becomes $\partial v/\partial y = \partial u/\partial x$, which
is the first CR relation. It follows that if f(z) is analytic inside and on C, then

$$\oint_C f(z)\, dz = 0, \qquad \text{(F.15)}$$

a result known as Cauchy’s theorem, the foundation of complex integral calculus.
Now let us consider a simple case in which (as in (F.7)) the result of integrating a
complex function around a closed curve is not zero, namely the integral

$$\oint_C \frac{dz}{z} \qquad \text{(F.16)}$$

where C is the circle of radius ρ enclosing the origin. On this circle, $z = \rho e^{i\theta}$ where ρ is fixed
and $0 \le \theta \le 2\pi$, so

$$\oint_C \frac{dz}{z} = \oint_C \frac{\rho e^{i\theta}\, i\, d\theta}{\rho e^{i\theta}} = i\int_0^{2\pi} d\theta = 2\pi i. \qquad \text{(F.17)}$$

Cauchy’s theorem does not apply in this case because the function being integrated ($z^{-1}$)
is not analytic at z = 0. Writing dz/z in terms of x and y we have

$$\frac{dz}{z} = \frac{dx + i\, dy}{x + iy} = \frac{(x - iy)}{x^2 + y^2}\,(dx + i\, dy) = \frac{x\, dx + y\, dy}{x^2 + y^2} + i\left(\frac{-y\, dx + x\, dy}{x^2 + y^2}\right). \qquad \text{(F.18)}$$

The reader will recognize the imaginary part of (F.18) as involving precisely the functions
(F.7) studied earlier, and may like to find the real potential function appropriate to the real
part of (F.18).

We note that the result (F.17) is independent of the circle’s radius p. This means that 
we can shrink or expand the circle how we like, without affecting the answer. The reader 
may like to show that the circle can, in fact, be distorted into a simple closed loop of any 
shape, enclosing z = 0, and the answer will still be 27i. In general, a contour may be freely 
distorted in any region in which the integrand is analytic. 
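The freedom to distort the contour can be explored numerically. The following sketch is an illustration, with hypothetical helper names and an arbitrary discretization: it approximates a closed contour integral by summing the integrand at chord midpoints, and applies it to 1/z on circles of very different radii, on an ellipse, and on a circle that does not enclose the origin.

```python
import cmath
import math

def contour_integral(f, path, n=20_000):
    """Estimate the closed integral of f(z) dz, where path(s) maps s in [0, 1]
    once anticlockwise around the contour, with path(1) = path(0)."""
    total = 0j
    for i in range(n):
        z0 = path(i / n)
        z1 = path((i + 1) / n)
        total += f((z0 + z1) / 2) * (z1 - z0)
    return total

def circle(rho, centre=0j):
    return lambda s: centre + rho * cmath.exp(2j * math.pi * s)

def ellipse(a, b, centre=0j):
    return lambda s: centre + a * math.cos(2 * math.pi * s) + 1j * b * math.sin(2 * math.pi * s)

inverse = lambda z: 1 / z
print(contour_integral(inverse, circle(0.1)))               # close to 2*pi*i
print(contour_integral(inverse, circle(50.0)))              # close to 2*pi*i
print(contour_integral(inverse, ellipse(3.0, 0.5)))         # close to 2*pi*i
print(contour_integral(inverse, circle(1.0, centre=3+0j)))  # close to 0
```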

Before continuing, we note some important terminology. The function $z^{-1}$ is not analytic
at z = 0 because its derivative ($-1/z^2$) is not defined at z = 0. A point at which a function
f(z) is not analytic is called a singularity of f(z). There are several possible types of
singularity, but for our purposes we are only interested in the simple pole, which is defined
as follows. If the limit as $z \to z_0$ of $(z - z_0) f(z)$ is non-zero, then f(z) has a simple pole at
$z = z_0$. In the present case, the function $z^{-1}$ has a simple pole at z = 0.

We are now in a position to prove the main integration formula we need, which is
Cauchy’s integral formula: let f(z) be analytic inside and on a simple closed curve C which
encloses the point z = a; then

$$\oint_C \frac{f(z)}{z - a}\, dz = 2\pi i f(a) \qquad \text{(F.19)}$$

where it is understood that C is traversed in an anticlockwise sense around z = a. The
proof follows. The integrand in (F.19) is analytic inside and on C, except at z = a; we may
therefore distort the contour C by shrinking it into a very small circle of fixed radius ρ
around the point z = a. On this circle, z is given by $z = a + \rho e^{i\theta}$, and

$$\oint_C \frac{f(z)}{z - a}\, dz = \int_0^{2\pi} \frac{f(a + \rho e^{i\theta})\, i\rho e^{i\theta}\, d\theta}{\rho e^{i\theta}} = \int_0^{2\pi} f(a + \rho e^{i\theta})\, i\, d\theta. \qquad \text{(F.20)}$$

Now, since f is analytic at z = a, it has a unique derivative there, and is consequently
continuous at z = a. We may then take the limit $\rho \to 0$ in (F.20), obtaining
$\lim_{\rho \to 0} f(a + \rho e^{i\theta}) = f(a)$, and hence

$$\oint_C \frac{f(z)}{z - a}\, dz = f(a)\int_0^{2\pi} i\, d\theta = 2\pi i f(a) \qquad \text{(F.21)}$$

as stated.
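Read from right to left, (F.19) says that the values of an analytic function on a closed curve determine its value at any interior point. A minimal numerical sketch of this, assuming a circular contour and using the exponential function as an arbitrary example, is given below; the function name and the number of sample points are choices made here.

```python
import cmath
import math

def cauchy_value(f, a, centre, rho, n=4096):
    """Estimate (1 / 2 pi i) * closed integral of f(z) / (z - a) dz around the
    circle |z - centre| = rho, traversed anticlockwise. By (F.19) this equals
    f(a) when a lies inside the circle."""
    total = 0j
    for i in range(n):
        theta = 2 * math.pi * (i + 0.5) / n
        w = rho * cmath.exp(1j * theta)     # z - centre on the circle
        z = centre + w
        dz = 1j * w * (2 * math.pi / n)     # dz = i (z - centre) d(theta)
        total += f(z) / (z - a) * dz
    return total / (2j * math.pi)

a = 0.3 + 0.2j
print(cauchy_value(cmath.exp, a, 0j, 2.0), cmath.exp(a))  # a inside: values agree
print(cauchy_value(cmath.exp, 3 + 0j, 0j, 2.0))           # a outside: close to zero
```

When the point a lies outside the circle the integrand is analytic inside and on the contour, so Cauchy’s theorem (F.15) applies and the result vanishes.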
We now use these results to establish the representation of the θ-function (see (E.49))
quoted in section 6.3.2. Consider the function F(t) of the real variable t defined by

$$F(t) = \frac{i}{2\pi}\oint_C \frac{e^{-izt}}{z + i\epsilon}\, dz \qquad \text{(F.22)}$$

where ε is an infinitesimally small positive number (i.e. it will tend to zero through positive
values). Note that the integrand in (F.22) has a simple pole at z = −iε. The closed contour
C is made up of $C_1$ which is the real axis from −R to R (we shall let $R \to \infty$ at the end),
and of $C_2$ which is a large semicircle of radius R with diameter the real axis, in either
the upper or lower half-plane, the choice being determined by the sign of t, as we shall
now explain (see figure F.1). Suppose first that t < 0, and let z on $C_2$ be parametrized as
$z = Re^{i\theta} = R\cos\theta + iR\sin\theta$. Then

$$e^{-izt} = e^{iz|t|} = e^{-R\sin\theta\,|t|}\, e^{iR\cos\theta\,|t|} \qquad \text{(F.23)}$$

from which it follows that the contribution to (F.22) from $C_2$ will vanish exponentially as
$R \to \infty$ provided that θ > 0, i.e. we choose $C_2$ to be in the upper half-plane (figure F.1(a)).
In that case the integrand of (F.22) is analytic inside and on C (the only non-analytic point
is outside C at z = −iε) and so

$$F(t) = 0 \quad \text{for } t < 0. \qquad \text{(F.24)}$$

However, suppose t > 0. Then

$$e^{-izt} = e^{R\sin\theta\, t}\, e^{-iR\cos\theta\, t} \qquad \text{(F.25)}$$

and in this case we must choose the ‘contour-closing’ $C_2$ to be in the lower half-plane (θ < 0)
or else (F.25) will diverge exponentially as $R \to \infty$. With this choice the $C_2$ contribution
will again go to zero as $R \to \infty$. However, this time the whole closed contour C does enclose
the point z = −iε (see figure F.1(b)), and we may apply Cauchy’s integral formula to get,
for t > 0,

$$F(t) = -2\pi i\, \frac{i}{2\pi}\, e^{-\epsilon t}, \qquad \text{(F.26)}$$

the minus sign at the front arising from the fact (see figure F.1(b)) that C is now being
traversed in a clockwise sense around z = −iε (this just inverts the limits in (F.21)). Thus
as $\epsilon \to 0$,

$$F(t) \to 1 \quad \text{for } t > 0. \qquad \text{(F.27)}$$


FIGURE F.1
Contours for F(t): (a) t < 0; (b) t > 0.


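Summarizing these manoeuvres, for t < 0 we chose $C_2$ in (F.22) in the upper half-plane
(figure F.1(a)), and its contribution vanished as $R \to \infty$. In this case we have, as $R \to \infty$,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-izt}}{z + i\epsilon}\, dz = 0 \quad \text{for } t < 0. \qquad \text{(F.28)}$$

For t > 0 we chose $C_2$ in the lower half-plane (figure F.1(b)), when again its contribution
vanished as $R \to \infty$. However, in this case F does not vanish, but instead we have, as
$R \to \infty$ and $\epsilon \to 0$,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-izt}}{z + i\epsilon}\, dz = 1 \quad \text{for } t > 0. \qquad \text{(F.29)}$$

Equations (F.28) and (F.29) show that we may indeed write

$$\theta(t) = \lim_{\epsilon \to 0} \frac{i}{2\pi}\int_{-\infty}^{\infty} \frac{e^{-izt}}{z + i\epsilon}\, dz \qquad \text{(F.30)}$$

as claimed in section 6.3, equation (6.93).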


G 


Green Functions 


Let us start with a simple but important example. We seek the solution $G_0(\mathbf{r})$ of the
equation

$$\nabla^2 G_0(\mathbf{r}) = \delta(\mathbf{r}). \qquad \text{(G.1)}$$

There is a ‘physical’ way to look at this equation which will give us the answer straightaway.
Recall that Gauss’ law in electrostatics (appendix C) is

$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0 \qquad \text{(G.2)}$$

and that E is expressed in terms of the electrostatic potential V as $\mathbf{E} = -\nabla V$. Then (G.2)
becomes

$$\nabla^2 V = -\rho/\epsilon_0 \qquad \text{(G.3)}$$

which is known as Poisson’s equation. Comparing (G.3) and (G.1), we see that $(-G_0(\mathbf{r})/\epsilon_0)$
can be regarded as the ‘potential’ due to a source ρ which is concentrated entirely at the
origin, and whose total ‘charge’ is unity, since (see appendix E)

$$\int \delta(\mathbf{r})\, d^3r = 1. \qquad \text{(G.4)}$$

In other words, $(-G_0/\epsilon_0)$ is effectively the potential due to a unit point charge at the origin.
But we know exactly what this potential is from Coulomb’s law, namely

$$\frac{-G_0(\mathbf{r})}{\epsilon_0} = \frac{1}{4\pi\epsilon_0 r} \qquad \text{(G.5)}$$

whence

$$G_0(\mathbf{r}) = -\frac{1}{4\pi r}. \qquad \text{(G.6)}$$

We may also check this result mathematically as follows. Using (G.6), equation (G.1) is
equivalent to

$$\nabla^2\left(\frac{1}{r}\right) = -4\pi\delta(\mathbf{r}). \qquad \text{(G.7)}$$


Let us consider the integral of both sides of this equation over a spherical volume of arbitrary
radius R surrounding the origin. The integral of the left-hand side becomes, using Gauss’
divergence theorem,

$$\int_V \nabla^2\left(\frac{1}{r}\right) d^3r = \int_V \nabla\cdot\nabla\left(\frac{1}{r}\right) d^3r = \int_{S\ \text{bounding}\ V} \nabla\left(\frac{1}{r}\right)\cdot\hat{\mathbf{n}}\, dS. \qquad \text{(G.8)}$$

Now

$$\nabla\left(\frac{1}{r}\right) = -\frac{\hat{\mathbf{r}}}{r^2} = -\frac{\hat{\mathbf{r}}}{R^2}$$

on the surface S, while $\hat{\mathbf{n}} = \hat{\mathbf{r}}$ and $dS = R^2\, d\Omega$ with dΩ the element of solid angle on the
sphere. So

$$\int_V \nabla^2\left(\frac{1}{r}\right) d^3r = -\int_S d\Omega = -4\pi \qquad \text{(G.9)}$$

which using (G.4) is precisely the integral of the right-hand side of (G.7), as required.
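Consider now the solutions of

$$(\nabla^2 + k^2)\, G_k(\mathbf{r}) = \delta(\mathbf{r}). \qquad \text{(G.10)}$$

We are interested in rotationally invariant solutions, for which $G_k$ is a function of $r = |\mathbf{r}|$
alone. For $r \ne 0$, equation (G.10) is easy to solve. Setting $G_k(r) = f(r)/r$, and using

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \text{parts depending on } \frac{\partial}{\partial\theta} \text{ and } \frac{\partial}{\partial\phi}$$

we find that f(r) satisfies

$$\frac{d^2 f}{dr^2} + k^2 f = 0$$

the general solution to which is ($k = |\mathbf{k}|$)

$$f(r) = Ae^{ikr} + Be^{-ikr},$$

leading to

$$G_k(r) = A\,\frac{e^{ikr}}{r} + B\,\frac{e^{-ikr}}{r} \qquad \text{(G.11)}$$

for $r \ne 0$. In the application to scattering problems (appendix H) we shall want $G_k$ to
contain purely outgoing waves, so we will pick the ‘A’-type solution in (G.11).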
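Consider therefore the expression

$$(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) \qquad \text{(G.12)}$$

where r is now allowed to take the value zero. Making use of the vector operator result

$$\nabla^2(fg) = (\nabla^2 f)\, g + 2\nabla f\cdot\nabla g + f\,(\nabla^2 g)$$

with ‘f’ $= e^{ikr}$ and ‘g’ $= 1/r$, together with

$$\nabla^2 e^{ikr} = \frac{2ik}{r}\, e^{ikr} - k^2 e^{ikr}, \qquad \nabla e^{ikr} = ik\,\frac{\mathbf{r}}{r}\, e^{ikr}, \qquad \nabla\left(\frac{1}{r}\right) = -\frac{\mathbf{r}}{r^3}$$

we find

$$(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) = Ae^{ikr}\,\nabla^2\left(\frac{1}{r}\right) = -4\pi A e^{ikr}\,\delta(\mathbf{r}) = -4\pi A\,\delta(\mathbf{r}) \qquad \text{(G.13)}$$

where we have replaced r by zero in the exponent of the last term of the last line in
(G.13), since the δ-function ensures that only this point need be considered for this term.
By choosing the constant $A = -1/4\pi$, we find that the (outgoing wave) solution of (G.10)
is

$$G_k^{(+)}(r) = -\frac{e^{ikr}}{4\pi r}. \qquad \text{(G.14)}$$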


We are also interested in spherically symmetric solutions of (restoring c and ħ explicitly
for the moment)

$$\left(\nabla^2 - \frac{m^2c^2}{\hbar^2}\right)\phi(\mathbf{r}) = \delta(\mathbf{r}) \qquad \text{(G.15)}$$

which is the equation analogous to (G.1) for a static classical scalar potential of a field whose
quanta have mass m. The solutions to (G.15) are easily found from the previous work by
letting $k \to imc/\hbar$. Retaining now the solution which goes to zero as $r \to \infty$, we find

$$\phi(\mathbf{r}) = -\frac{1}{4\pi}\,\frac{e^{-r/a}}{r} \qquad \text{(G.16)}$$

where $a = \hbar/mc$, the Compton wavelength of the quantum, with mass m. The potential
(G.16) is (up to numerical constants) the famous Yukawa potential, in which the quantity
‘a’ is called the range: as r gets greater than a, φ(r) becomes exponentially small. Thus,
just as the Coulomb potential is the solution of Poisson’s equation (G.3) corresponding
to a point source at the origin, so the Yukawa potential is the solution of the analogous
equation (G.15), also with a point source at the origin. Note that as $a \to \infty$, $\phi(\mathbf{r}) \to G_0(\mathbf{r})$.
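Away from the origin, (G.15) can be checked directly. The sketch below is a rough numerical illustration, using the standard fact that the Laplacian of a spherically symmetric function is (1/r) d²(rφ)/dr². It evaluates the residual of (G.15) for the Yukawa form (G.16) by finite differences, and also illustrates the limit a → ∞. The step size and sample points are arbitrary choices made here.

```python
import math

def yukawa(r, a):
    """phi(r) of (G.16): -(1 / 4 pi) exp(-r / a) / r."""
    return -math.exp(-r / a) / (4 * math.pi * r)

def coulomb_green(r):
    """G_0(r) of (G.6): -1 / (4 pi r)."""
    return -1.0 / (4 * math.pi * r)

def radial_laplacian(phi, r, h=1e-4):
    """Laplacian of a spherically symmetric function, (1/r) d^2(r phi)/dr^2,
    estimated with a central second difference."""
    u = lambda s: s * phi(s)
    return (u(r + h) - 2 * u(r) + u(r - h)) / (h * h * r)

def residual(r, a):
    """(Laplacian - 1/a^2) phi evaluated at r != 0; should be close to zero."""
    phi = lambda s: yukawa(s, a)
    return radial_laplacian(phi, r) - phi(r) / (a * a)

print(residual(1.3, 0.7), residual(0.4, 2.0))   # both close to zero
print(yukawa(2.0, 1e9), coulomb_green(2.0))     # nearly equal for very large a
```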

Functions such as $G_k$, $G_0$ and φ, which generically satisfy equations of the form

$$\Omega_r G(\mathbf{r}) = \delta(\mathbf{r}) \qquad \text{(G.17)}$$

where $\Omega_r$ is some linear differential operator, are said to be Green functions of the operator
$\Omega_r$. From the examples already treated, it is clear that G(r) in (G.17) has the general
interpretation of a ‘potential’ due to a point source at the origin, when $\Omega_r$ is the appropriate
operator for the field theory in question.

Green functions play an important role in the solution of differential equations of the
type

$$\Omega_r \psi(\mathbf{r}) = s(\mathbf{r}) \qquad \text{(G.18)}$$

where s(r) is a known ‘source function’ (e.g. the charge density in (G.3)). The solution of
(G.18) may be written as

$$\psi(\mathbf{r}) = u(\mathbf{r}) + \int G(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, d^3r' \qquad \text{(G.19)}$$

where u(r) is a solution of $\Omega_r u(\mathbf{r}) = 0$. Thus once we know G, we have the solution via
(G.19).

Equation (G.19) has a simple physical interpretation. We know that G(r) is the solution
of (G.18) with s(r) replaced by δ(r). But by writing

$$s(\mathbf{r}) = \int \delta(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, d^3r' \qquad \text{(G.20)}$$

we can formally regard s(r) as being made up of a superposition of point sources, distributed
at points r′ with a weighting function s(r′). Then, since the operator $\Omega_r$ is (by
assumption) linear, the solution for such a superposition of point sources must be just the
same superposition of the point source solutions, namely the integral on the right-hand side
of (G.19). This integral term is, in fact, the ‘particular integral’ of the differential equation
(G.18), while the u(r) is the ‘complementary function’.
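The superposition picture can be illustrated with a crude numerical model. The sketch below, an illustration only, takes the operator to be the Laplacian, so that G is $G_0$ of (G.6), and the source to be a uniform density inside a sphere, chopped into small cubic cells. Both the source and the discretization are assumptions made for this example. Outside the sphere the summed point-source solutions should reproduce the potential of a single point source carrying the total ‘charge’ Q, namely −Q/(4πr).

```python
import math

def green0(dx, dy, dz):
    """G_0(r - r') = -1 / (4 pi |r - r'|), cf. (G.6)."""
    return -1.0 / (4 * math.pi * math.sqrt(dx * dx + dy * dy + dz * dz))

def sphere_cells(radius, n):
    """Centres of the cubic cells (side 2 * radius / n) lying inside a sphere
    of the given radius centred on the origin, plus the cell volume."""
    h = 2 * radius / n
    cells = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x = -radius + (i + 0.5) * h
                y = -radius + (j + 0.5) * h
                z = -radius + (k + 0.5) * h
                if x * x + y * y + z * z <= radius * radius:
                    cells.append((x, y, z))
    return cells, h ** 3

def psi(point, cells, dV, density=1.0):
    """Discretized integral term of (G.19): sum of G_0(r - r') s(r') dV."""
    px, py, pz = point
    return sum(green0(px - x, py - y, pz - z) * density * dV for x, y, z in cells)

cells, dV = sphere_cells(1.0, 20)
Q = dV * len(cells)                      # total 'charge' of the discretized source
print(psi((4.0, 0.0, 0.0), cells, dV), -Q / (4 * math.pi * 4.0))
```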

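Equation (G.19) can also be checked analytically. First note that it is generally the case
that the operator $\Omega_r$ is translationally invariant, so that

$$\Omega_r = \Omega_{r - r'}; \qquad \text{(G.21)}$$

the right-hand side of (G.21) amounts to shifting the origin to the point r′. Applying $\Omega_r$ to
both sides of (G.19), we find

$$\Omega_r\psi(\mathbf{r}) = \Omega_r u(\mathbf{r}) + \int \Omega_r G(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, d^3r'$$

$$= 0 + \int \Omega_{r - r'}\, G(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, d^3r' = \int \delta(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, d^3r' = s(\mathbf{r})$$

as required in (G.18).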


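Finally, consider the Fourier transform of equation (G.10), defined as

$$\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\, G_k(\mathbf{r})\, d^3r = \int e^{-i\mathbf{q}\cdot\mathbf{r}}\,\delta(\mathbf{r})\, d^3r.$$

The right-hand side is unity, by equation (G.4). On the left-hand side we may use the result

$$\int u(\mathbf{r})\,\nabla^2 v(\mathbf{r})\, d^3r = \int \left(\nabla^2 u(\mathbf{r})\right) v(\mathbf{r})\, d^3r$$

(proved by integrating by parts, assuming u and v go to zero sufficiently fast at the boundaries
of the integral) to obtain

$$\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\, G_k(\mathbf{r})\, d^3r = \int \left\{(\nabla^2 e^{-i\mathbf{q}\cdot\mathbf{r}}) + k^2 e^{-i\mathbf{q}\cdot\mathbf{r}}\right\} G_k(\mathbf{r})\, d^3r$$

$$= \int (-q^2 + k^2)\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, G_k(\mathbf{r})\, d^3r = (-q^2 + k^2)\,\tilde{G}_k(\mathbf{q})$$

where $\tilde{G}_k(\mathbf{q})$ is the Fourier transform of $G_k(\mathbf{r})$. Since this expression has to equal unity, we
have

$$\tilde{G}_k(\mathbf{q}) = \frac{1}{k^2 - q^2}. \qquad \text{(G.22)}$$

There is, however, a problem with (G.22) as it stands, which is that it is undefined when
the variable $q^2$ takes the value equal to the parameter $k^2$ in the original equation. Indeed,
various definitions are possible, corresponding to the type of solution in r-space for $G_k(\mathbf{r})$
(i.e. ingoing, outgoing or standing wave). It turns out (see the exercise at the end of this
appendix) that the specification which is equivalent to the solution $G_k^{(+)}(\mathbf{r})$ in (G.14) is to
add an infinitesimally small imaginary part in the denominator of (G.22):

$$\tilde{G}_k^{(+)}(\mathbf{q}) = \frac{1}{k^2 - q^2 + i\epsilon}. \qquad \text{(G.23)}$$

In exactly the same way, the Fourier transform of φ(r) satisfying (G.15) is

$$\tilde{\phi}(\mathbf{q}) = \frac{-1}{q^2 + m^2} \qquad \text{(G.24)}$$

where we have reverted to units such that ħ = c = 1.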
The relativistic generalization of this result is straightforward. Consider the equation

$$(\Box + m^2)\, G(x) = -\delta(x) \qquad \text{(G.25)}$$

where x is the coordinate 4-vector and δ(x) is the four-dimensional δ-function, $\delta(x^0)\,\delta^3(\mathbf{x})$;
the sign in (G.25) has been chosen to be consistent with (G.15) in the static case. Taking the
four-dimensional Fourier transform, and making suitable assumptions about the vanishing
of G at the boundary of space-time, we obtain

$$(-q^2 + m^2)\,\tilde{G}(q) = -1 \qquad \text{(G.26)}$$

where

$$\tilde{G}(q) = \int e^{iq\cdot x}\, G(x)\, d^4x$$

and so

$$\tilde{G}(q) = \frac{1}{q^2 - m^2}. \qquad \text{(G.27)}$$

As we shall see in detail in chapter 6, the Feynman prescription for selecting the physically
desired solution amounts to adding an ‘iε’ term in the denominator of (G.27):

$$\tilde{G}(q) = \frac{1}{q^2 - m^2 + i\epsilon}. \qquad \text{(G.28)}$$

Exercise 


Verify the ‘iε’ specification in (G.23), using the methods of appendix F. [Hints: You need
to show that the Fourier transform of (G.23), defined by

$$G_k^{(+)}(\mathbf{r}) = \frac{1}{(2\pi)^3}\int \frac{e^{i\mathbf{q}\cdot\mathbf{r}}}{k^2 - q^2 + i\epsilon}\, d^3q \qquad \text{(G.29)}$$

is equal to $G_k^{(+)}(\mathbf{r})$ of (G.14). Do the integration over the polar angles of q, taking the
direction of r as the polar axis. This gives

$$G_k^{(+)}(\mathbf{r}) = -\frac{1}{8\pi^2}\int_{-\infty}^{\infty} \left(\frac{e^{iqr} - e^{-iqr}}{ir}\right) \frac{q\, dq}{q^2 - k^2 - i\epsilon} \qquad \text{(G.30)}$$

where $q = |\mathbf{q}|$, $r = |\mathbf{r}|$, and we have used the fact that the integrand is an even function
of q to extend the lower limit to −∞, with an overall factor of 1/2. Now convert q to the
complex variable z. Locate the poles of $(z^2 - k^2 - i\epsilon)^{-1}$ (compare the similar calculation in
section 10.3.1, and in appendix F). Apply Cauchy’s integral formula (F.19), closing the $e^{izr}$
part in the upper half z-plane, and the $e^{-izr}$ part in the lower half z-plane.]


H 


Elements of Non-Relativistic Scattering Theory 


H.1 Time-independent formulation and differential cross section 


We consider the scattering of a particle of mass m by a fixed spherically symmetric potential
$V(\mathbf{r})$; we shall retain ħ explicitly in what follows. The potential is assumed to go to zero
rapidly as $r \to \infty$, as for the Yukawa potential (G.16); it will turn out that the important
Coulomb case can be treated as the $a \to \infty$ limit of (G.16). We shall treat the problem here
as a stationary state one, in which the Schrödinger wavefunction $\psi(\mathbf{r}, t)$ has the form

$$\psi(\mathbf{r}, t) = \phi(\mathbf{r})\, e^{-iEt/\hbar} \qquad \text{(H.1)}$$

where E is the particle’s energy, and where $\phi(\mathbf{r})$ satisfies the equation

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right)\phi(\mathbf{r}) = E\phi(\mathbf{r}). \qquad \text{(H.2)}$$

We shall take V to be spherically symmetric, so that $V(\mathbf{r}) = V(r)$ where $r = |\mathbf{r}|$. In this
approach to scattering, we suppose the potential to be ‘bathed’ in a steady flux of incident
particles, all of energy E. The wavefunction for the incident beam, far from the region near
the origin where V is appreciably non-zero, is then just a plane wave of the form $\phi_{\rm inc} = e^{ikz}$,
where the z-axis has been chosen along the propagation direction, and where $E = \hbar^2k^2/2m$
with $\mathbf{k} = (0, 0, k)$. This plane wave is normalized to one particle per unit volume, and yields
a steady-state flux of

$$\mathbf{j}_{\rm inc} = \frac{\hbar}{2mi}\left[\phi^*_{\rm inc}\nabla\phi_{\rm inc} - \phi_{\rm inc}\nabla\phi^*_{\rm inc}\right] = \hbar\mathbf{k}/m = \mathbf{p}/m \qquad \text{(H.3)}$$

where the momentum is $\mathbf{p} = \hbar\mathbf{k}$. As expected, the incident flux is given by the velocity v
per unit volume.

Though we have represented the incident beam as a plane wave, it will, in practice, be
collimated. We could, of course, superpose such plane waves, with different k’s, to make a
wave-packet of any desired localization. But the dimensions of practical beams are so much
greater than the de Broglie wavelength $\lambda = h/p$ of our particles, that our plane wave will
be a very good approximation to a realistic packet.

The form of the complete solution to (H.2), even in the region where V is essentially
zero, is not simply the incident plane wave, however. The presence of the potential gives
rise also to a scattered wave, whose form as $r \to \infty$ is

$$\phi_{\rm sc} = f(\theta, \varphi)\,\frac{e^{ikr}}{r}. \qquad \text{(H.4)}$$

We shall actually derive this later, but its physical interpretation is simply that it is an
outgoing ($\sim e^{ikr}$ rather than $e^{-ikr}$) ‘spherical wave’, with a factor $f(\theta, \varphi)$ called the scattering
amplitude that allows for the fact that even though V(r) is spherically symmetric, the
solution, in general, will not be (recall the bound-state solutions of the Coulomb potential
in the hydrogen atom). Calculating the radial component of the flux corresponding to (H.4)
yields

$$j_{r,{\rm sc}} = \frac{\hbar}{2mi}\left[\phi^*_{\rm sc}\frac{\partial\phi_{\rm sc}}{\partial r} - \phi_{\rm sc}\frac{\partial\phi^*_{\rm sc}}{\partial r}\right] = \frac{\hbar k}{m}\,\frac{|f(\theta, \varphi)|^2}{r^2}. \qquad \text{(H.5)}$$

The flux in the two non-radial directions will contain an extra power of r in the
denominator (recall that

$$\nabla = \hat{\mathbf{r}}\frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}}\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\boldsymbol{\varphi}}\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}$$

) and so (H.5) represents the correct asymptotic form of the scattered flux.
The cross section is now easily found. The differential cross section, dσ, for scattering
into the element of solid angle dΩ is defined by

$$d\sigma = j_{r,{\rm sc}}\, dS/|\mathbf{j}_{\rm inc}| \qquad \text{(H.6)}$$

where $dS = r^2\, d\Omega$, so that from (H.3) and (H.5)

$$\frac{d\sigma}{d\Omega} = |f(\theta, \varphi)|^2. \qquad \text{(H.7)}$$

The total cross section is then just

$$\sigma = \int |f(\theta, \varphi)|^2\, d\Omega. \qquad \text{(H.8)}$$

It is important to realize that the complete asymptotic form of the solution to (H.2) is
the superposition of $\phi_{\rm inc}$ and $\phi_{\rm sc}$:

$$\phi(\mathbf{r}) \xrightarrow{\ r \to \infty\ } e^{ikz} + f(\theta, \varphi)\,\frac{e^{ikr}}{r}. \qquad \text{(H.9)}$$

Note that in the ‘forward direction’ (i.e. within a region close to the z-axis, as determined
by the collimation), the incident and scattered waves will interfere. Careful analysis reveals
a depletion of the incident beam in the forward direction (the ‘shadow’ of the scattering
centre), which corresponds exactly to the total flux scattered into all angles (Gottfried 1966,
section 12.3). This is expressed in the optical theorem:

$$\operatorname{Im} f(0) = \frac{k}{4\pi}\,\sigma. \qquad \text{(H.10)}$$


H.2 Expression for the scattering amplitude: Born approximation 


We begin by rewriting (H.2) as

$$(\nabla^2 + k^2)\,\phi(\mathbf{r}) = \frac{2m}{\hbar^2}\, V(\mathbf{r})\,\phi(\mathbf{r}). \qquad \text{(H.11)}$$

This equation is of exactly the form discussed in appendix G, e.g. equation (G.18) with
$\Omega_r = \nabla^2 + k^2$. Further, we know that the Green function for this $\Omega_r$ corresponding to the
desired outgoing wave solution, is given by (G.14). Using then (G.19) and (G.14), we can
immediately write the ‘formal solution’ of (H.11) as

$$\phi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{2m}{\hbar^2}\,\frac{1}{4\pi}\int \frac{e^{ik|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}\, V(\mathbf{r}')\,\phi(\mathbf{r}')\, d^3r' \qquad \text{(H.12)}$$

where we have chosen ‘u(r)’ in (G.19) to be the incident plane wave $\phi_{\rm inc}$, and have used
$\mathbf{k}\cdot\mathbf{r} = kz$. We say ‘formal’ because of course the unknown $\phi(\mathbf{r}')$ still appears on the
right-hand side of (H.12).

It may therefore seem that we have made no progress—but in fact (H.12) leads to a very 
useful expression for f(0,@), which is the quantity we need to calculate. This can be found 
by considering the asymptotic (r > oo) limit of the integral term in (H.12). We have 


Ir-r] = (r? +r’? -— 2r. r)? 


~ r=r:r'/r+0 (>) terms. (H.13) 


Thus in the exponent we may write 


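$$e^{ik|\mathbf{r}-\mathbf{r}'|} \approx e^{ik(r - \mathbf{r}\cdot\mathbf{r}'/r)} = e^{ikr}\,e^{-i\mathbf{k}'\cdot\mathbf{r}'}$$

where $\mathbf{k}' = k\hat{\mathbf{r}}$ is the outgoing wavevector, pointing along the direction of the outgoing
scattered wave which enters $\mathrm{d}S$. In the denominator factor we may simply say $|\mathbf{r}-\mathbf{r}'|^{-1} \approx r^{-1}$,
since the next term in (H.13) will produce a correction of order $r^{-2}$. Putting this
together, we have

$$\psi(\mathbf{r}) \;\xrightarrow{\,r\to\infty\,}\; e^{ikz} - \frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\psi(\mathbf{r}')\,\mathrm{d}^3\mathbf{r}' \qquad \text{(H.14)}$$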


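from which follows the formula for $f(\theta,\phi)$:

$$f(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\psi(\mathbf{r}')\,\mathrm{d}^3\mathbf{r}'. \qquad \text{(H.15)}$$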

No approximations have been made thus far in deriving (H.15) but, of course, it still
involves the unknown $\psi(\mathbf{r}')$ inside the integral. However, it is in a form which is very
convenient for setting up a systematic approximation scheme, a kind of perturbation theory, in
powers of $V$. If the potential is relatively ‘weak’, its effect will be such as to produce only a
slight distortion of the incident wave, and so $\psi(\mathbf{r}) \approx e^{i\mathbf{k}\cdot\mathbf{r}}$ + ‘small correction’. This suggests
that it may be a good approximation to replace $\psi(\mathbf{r}')$ in (H.15) by the undistorted incident
wave $e^{i\mathbf{k}\cdot\mathbf{r}'}$, giving the approximate scattering amplitude

$$f_{\rm BA}(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{i\mathbf{q}\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\mathrm{d}^3\mathbf{r}' \qquad \text{(H.16)}$$

where the wave vector transfer $\mathbf{q}$ is given by

$$\mathbf{q} = \mathbf{k} - \mathbf{k}'. \qquad \text{(H.17)}$$


This is called the ‘Born approximation to the scattering amplitude’. The criteria for the 
validity of the Born approximation are discussed in many standard quantum mechanics 
texts. 
The approximation can be improved by returning to (H.12) for $\psi(\mathbf{r})$, and replacing
$\psi(\mathbf{r}')$ inside the integral by $e^{i\mathbf{k}\cdot\mathbf{r}'}$, just as we did in (H.16); this will give us a formula for
the first-order (in $V$) correction to $\psi(\mathbf{r})$. We can now insert this expression for $\psi(\mathbf{r}')$ (i.e.
$\psi(\mathbf{r}') = e^{i\mathbf{k}\cdot\mathbf{r}'} + O(V)$ correction) into (H.15), which will give us $f_{\rm BA}$ again as the first term,
but also another term, of order $V^2$ (since $V$ appears in the integral in (H.15)). By iterating
the process indefinitely, the Born series can be set up, to all orders in $V$.
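
As an illustration of the Born formula, the sketch below evaluates the first Born amplitude numerically for a Yukawa potential and compares it with the known closed form. It assumes units with ħ = m = 1, a central potential (so the angular integration can be done analytically, leaving a radial integral that depends only on q = |q|), and a hard cut-off `r_max` on the radial integral. The function names and the parameter values are illustrative choices, not taken from the text.

```python
import numpy as np


def born_amplitude(V, q, r_max=60.0, n=200_001):
    """First Born amplitude f_BA(q) for a central potential V(r), in units hbar = m = 1.

    After the angular integration, the Born formula reduces to
        f_BA = -(m / (2 pi hbar^2)) * (4 pi / q) * int_0^inf r V(r) sin(q r) dr.
    The radial integral is cut off at r_max (an assumption that V decays fast
    enough) and evaluated with the trapezoidal rule.
    """
    r = np.linspace(1e-8, r_max, n)          # avoid r = 0, where V may be singular
    integrand = r * V(r) * np.sin(q * r)
    dr = r[1] - r[0]
    radial = dr * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return -(1.0 / (2.0 * np.pi)) * (4.0 * np.pi / q) * radial


def yukawa(V0, mu):
    """Return the Yukawa potential V(r) = V0 exp(-mu r) / r as a function of r."""
    return lambda r: V0 * np.exp(-mu * r) / r


def born_yukawa_exact(V0, mu, q):
    """Closed form of the same amplitude: -2 V0 / (q^2 + mu^2) for hbar = m = 1."""
    return -2.0 * V0 / (q**2 + mu**2)


if __name__ == "__main__":
    V0, mu = 1.0, 1.0                         # illustrative values only
    for q in (0.5, 1.5, 3.0):
        print(q, born_amplitude(yukawa(V0, mu), q), born_yukawa_exact(V0, mu, q))
```

For elastic scattering |k| = |k'| = k, so q = 2k sin(θ/2), and the Born differential cross section is the square of the amplitude returned here.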


H.3 Time-dependent approach 


In this approach we consider the potential $V(\mathbf{r})$ as causing transitions between states
describing the incident and scattered particles. From standard time-dependent perturbation
theory in quantum mechanics, the transition probability per unit time for going from state
$|i\rangle$ to state $|f\rangle$, to first order in $V$, is given by

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\langle f|V|i\rangle|^2\,\rho(E_f) \qquad \text{(H.18)}$$

where $\rho(E_f)\,\mathrm{d}E_f$ is the number of final states in the energy range $\mathrm{d}E_f$ around the
energy-conserving point $E_i = E_f$. Equation (H.18) is often known as the ‘Golden Rule’. In the
present case, if we adopt the same normalization as in the previous section, the initial and
final states are represented by the wavefunctions $e^{i\mathbf{k}\cdot\mathbf{r}}$ and $e^{i\mathbf{k}'\cdot\mathbf{r}}$, so that

$$\langle f|V|i\rangle = \int e^{-i\mathbf{k}'\cdot\mathbf{r}}\,V(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\mathrm{d}^3\mathbf{r} \equiv \tilde{V}(\mathbf{q}). \qquad \text{(H.19)}$$

Also, the number of such states in a volume element $\mathrm{d}^3\mathbf{p}'$ of momentum space ($\mathbf{p}' = \hbar\mathbf{k}'$) is
$\mathrm{d}^3\mathbf{p}'/(2\pi\hbar)^3$.

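In spherical polar coordinates, with $\mathrm{d}\Omega$ standing for the element of solid angle around
the direction $(\theta, \phi)$ of $\mathbf{p}'$, we have

$$\mathrm{d}^3\mathbf{p}' = \mathbf{p}'^2\,\mathrm{d}|\mathbf{p}'|\,\mathrm{d}\Omega = m|\mathbf{p}'|\,\mathrm{d}E'\,\mathrm{d}\Omega \qquad \text{(H.20)}$$

where we have used $E' = \mathbf{p}'^2/2m$. It follows that

$$\rho(E')\,\mathrm{d}E' = \frac{\mathrm{d}^3\mathbf{p}'}{(2\pi\hbar)^3} = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,\mathrm{d}\Omega\,\mathrm{d}E' \qquad \text{(H.21)}$$

and so

$$\rho(E') = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,\mathrm{d}\Omega. \qquad \text{(H.22)}$$

Inserting (H.19) and (H.22) into (H.18) we obtain, for this case,

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\tilde{V}(\mathbf{q})|^2\,\frac{m|\mathbf{p}'|}{(2\pi\hbar)^3}\,\mathrm{d}\Omega. \qquad \text{(H.23)}$$

To get the cross section, we need to divide this expression by the incident flux, which is
$|\mathbf{p}|/m$ as in (H.3). Thus the differential cross section for scattering into the element of solid
angle $\mathrm{d}\Omega$ in the direction $(\theta, \phi)$ is

$$\mathrm{d}\sigma = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\tilde{V}(\mathbf{q})|^2\,\mathrm{d}\Omega. \qquad \text{(H.24)}$$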
Comparing (H.24) with (H.7) and (H.16), we see that this application of the Golden Rule
(first-order time-dependent perturbation theory) is exactly equivalent to the Born
approximation in the time-independent approach. It is, however, the time-dependent approach
which is much closer to the corresponding quantum field theory formulation we introduce
in chapter 6.


I 


The Schrödinger and Heisenberg Pictures


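The standard introductory formalism of quantum mechanics is that of Schrödinger, in which
the dynamical variables (such as $\hat{\mathbf{x}}$ and $\hat{\mathbf{p}} = -i\nabla$) are independent of time, while the
wavefunction $\psi$ changes with time according to the general equation

$$\hat{H}\psi(\mathbf{x},t) = i\,\frac{\partial\psi(\mathbf{x},t)}{\partial t} \qquad \text{(I.1)}$$

where $\hat{H}$ is the Hamiltonian. Matrix elements of operators $\hat{A}$ depending on $\hat{\mathbf{x}}, \hat{\mathbf{p}}, \ldots$ then
have the form

$$\langle\phi|\hat{A}|\psi\rangle = \int \phi^*(\mathbf{x},t)\,\hat{A}\,\psi(\mathbf{x},t)\,\mathrm{d}^3\mathbf{x} \qquad \text{(I.2)}$$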


and will, in general, depend on time via the time dependences of $\phi$ and $\psi$. Although used
almost universally in introductory courses on quantum mechanics, this formulation is not 
the only possible one, nor is it always the most convenient. 

We may, for example, wish to bring out similarities (and differences) between the general 
dynamical frameworks of quantum and classical mechanics. The formulation here does not 
seem to be well adapted to this purpose, since in the classical case the dynamical variables 
depend on time ($\mathbf{x}(t)$, $\mathbf{p}(t)$, ...) and obey equations of motion, while the quantum variables
$\hat{A}$ are time-independent and the ‘equation of motion’ (I.1) is for the wavefunction $\psi$, which
has no classical counterpart. In quantum mechanics, however, it is always possible to make 
unitary transformations of the state vector or wavefunctions. We can make use of this 
possibility to obtain an alternative formulation of quantum mechanics, which is in some 
ways closer to the spirit of classical mechanics, as follows. 

Equation (I.1) can be formally solved to give 


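$$\psi(\mathbf{x},t) = e^{-i\hat{H}t}\,\psi(\mathbf{x},0) \qquad \text{(I.3)}$$

where the exponential (of an operator!) can be defined by the corresponding power series,
for example:

$$e^{-i\hat{H}t} = 1 - i\hat{H}t + \frac{1}{2!}(-i\hat{H}t)^2 + \cdots. \qquad \text{(I.4)}$$

It is simple to check that (I.3) as defined by (I.4) does satisfy (I.1) and that the operator
$\hat{U} = \exp(-i\hat{H}t)$ is unitary:

$$\hat{U}^\dagger = [\exp(-i\hat{H}t)]^\dagger = \exp(i\hat{H}^\dagger t) = \exp(i\hat{H}t) = \hat{U}^{-1} \qquad \text{(I.5)}$$

where the Hermitian property $\hat{H}^\dagger = \hat{H}$ has been used. Thus (I.3) can be viewed as a unitary
transformation from the time-dependent wavefunction $\psi(\mathbf{x},t)$ to the time-independent one
$\psi(\mathbf{x},0)$. Correspondingly the matrix element (I.2) is then

$$\langle\phi|\hat{A}|\psi\rangle = \int \phi^*(\mathbf{x},0)\,e^{i\hat{H}t}\,\hat{A}\,e^{-i\hat{H}t}\,\psi(\mathbf{x},0)\,\mathrm{d}^3\mathbf{x} \qquad \text{(I.6)}$$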
which can be regarded as the matrix element of the time-dependent operator

$$\hat{A}(t) = e^{i\hat{H}t}\,\hat{A}\,e^{-i\hat{H}t} \qquad \text{(I.7)}$$

between time-independent wavefunctions $\phi^*(\mathbf{x},0)$, $\psi(\mathbf{x},0)$.

Since (I.6) is perfectly general, it is clear that we can calculate amplitudes in quantum
mechanics in either of the two ways outlined: (a) by using time-dependent $\psi$’s and
time-independent $\hat{A}$’s, which is called the ‘Schrödinger picture’ or (b) by using time-independent
$\psi$’s and time-dependent $\hat{A}$’s, which is called the ‘Heisenberg picture’. The wavefunctions
and operators in the two pictures are related by (I.3) and (I.7). We note that the pictures
coincide at the (conventionally chosen) time $t = 0$.

Since $\hat{A}(t)$ is now time-dependent, we can ask for its equation of motion. Differentiating
(I.7) carefully, we find (if $\hat{A}$ does not depend explicitly on $t$) that

$$\frac{\mathrm{d}\hat{A}(t)}{\mathrm{d}t} = -i[\hat{A}(t), \hat{H}] \qquad \text{(I.8)}$$

which is called the Heisenberg equation of motion for $\hat{A}(t)$. On the right-hand side of (I.8), $\hat{H}$
is the Schrödinger operator; however, if $\hat{H}$ is substituted for $\hat{A}$ in (I.7), one finds $\hat{H}(t) = \hat{H}$,
so $\hat{H}$ can equally well be interpreted as the Heisenberg operator. For simple Hamiltonians
$\hat{H}$, (I.8) leads to operator equations quite analogous to classical equations of motion, which
can sometimes be solved explicitly (see section 5.2.2 of chapter 5).
The foregoing ideas apply equally well to the operators and state vectors of quantum
field theory.
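
The equivalence of the two pictures, and the Heisenberg equation of motion, can be checked numerically on a small matrix model. The sketch below is only an illustration under stated assumptions: a two-level system with ħ = 1, an arbitrarily chosen Hermitian matrix `H`, observable `A`, and states `psi0`, `phi0`, none of which come from the text. The exponential of the operator is built from the eigen-decomposition of `H` rather than from the power series.

```python
import numpy as np


def evolution_operator(H, t):
    """U(t) = exp(-i H t) for a Hermitian matrix H, via its eigen-decomposition (hbar = 1)."""
    energies, vectors = np.linalg.eigh(H)
    return vectors @ np.diag(np.exp(-1j * energies * t)) @ vectors.conj().T


def heisenberg_operator(A, H, t):
    """Heisenberg-picture operator A(t) = exp(i H t) A exp(-i H t)."""
    U = evolution_operator(H, t)
    return U.conj().T @ A @ U


# Illustrative two-level system; every number below is an arbitrary choice.
H = np.array([[1.0, 0.3], [0.3, -0.5]], dtype=complex)   # Hermitian Hamiltonian
A = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)    # observable at t = 0
psi0 = np.array([1.0, 0.0], dtype=complex)
phi0 = np.array([0.6, 0.8j], dtype=complex)
t = 0.7

U = evolution_operator(H, t)

# Schrodinger picture: the states carry the time dependence.
schrodinger_element = np.vdot(U @ phi0, A @ (U @ psi0))

# Heisenberg picture: the operator carries the time dependence.
A_t = heisenberg_operator(A, H, t)
heisenberg_element = np.vdot(phi0, A_t @ psi0)

# Heisenberg equation of motion, checked with a central finite difference.
dt = 1e-5
dA_dt = (heisenberg_operator(A, H, t + dt) - heisenberg_operator(A, H, t - dt)) / (2 * dt)
commutator_rhs = -1j * (A_t @ H - H @ A_t)

if __name__ == "__main__":
    print(schrodinger_element, heisenberg_element)
    print(np.max(np.abs(dA_dt - commutator_rhs)))
```

The two matrix elements agree to rounding error, and the finite-difference derivative of the Heisenberg operator matches the commutator to the accuracy of the difference step.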


J 


Dirac Algebra and Trace Identities 


J.1 Dirac algebra 
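J.1.1 γ matrices


The fundamental anti-commutator

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \qquad \text{(J.1)}$$

may be used to prove the following results.

$$\gamma_\mu\gamma^\mu = 4 \qquad \text{(J.2)}$$
$$\gamma_\mu\slashed{a}\gamma^\mu = -2\slashed{a} \qquad \text{(J.3)}$$
$$\gamma_\mu\slashed{a}\slashed{b}\gamma^\mu = 4a\cdot b \qquad \text{(J.4)}$$
$$\gamma_\mu\slashed{a}\slashed{b}\slashed{c}\gamma^\mu = -2\slashed{c}\slashed{b}\slashed{a} \qquad \text{(J.5)}$$
$$\slashed{a}\slashed{b} = -\slashed{b}\slashed{a} + 2a\cdot b. \qquad \text{(J.6)}$$

As an example, we prove this last result:

$$\slashed{a}\slashed{b} = a_\mu b_\nu\gamma^\mu\gamma^\nu = a_\mu b_\nu(-\gamma^\nu\gamma^\mu + 2g^{\mu\nu}) = -\slashed{b}\slashed{a} + 2a\cdot b.$$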

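J.1.2 γ5 identities
Define

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3. \qquad \text{(J.7)}$$

In the usual representation with

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix} \qquad \text{(J.8)}$$

$\gamma_5$ is the matrix

$$\gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad \text{(J.9)}$$

Either from the definition or using this explicit form, it is easy to prove that

$$\gamma_5^2 = 1 \qquad \text{(J.10)}$$

and

$$\{\gamma_5, \gamma^\mu\} = 0 \qquad \text{(J.11)}$$

i.e. $\gamma_5$ anti-commutes with the other γ-matrices. Defining the totally antisymmetric tensor

$$\epsilon_{\mu\nu\rho\sigma} = \begin{cases} +1 & \text{for an even permutation of } 0, 1, 2, 3 \\ -1 & \text{for an odd permutation of } 0, 1, 2, 3 \\ \phantom{+}0 & \text{if two or more indices are the same} \end{cases} \qquad \text{(J.12)}$$

we may write

$$\gamma_5 = \frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma. \qquad \text{(J.13)}$$

With this form it is possible to prove

$$\gamma_5\gamma_\sigma = \frac{i}{3!}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho \qquad \text{(J.14)}$$

and the identity

$$\gamma^\mu\gamma^\nu\gamma^\rho = g^{\mu\nu}\gamma^\rho - g^{\mu\rho}\gamma^\nu + g^{\nu\rho}\gamma^\mu + i\epsilon^{\mu\nu\rho\sigma}\gamma_5\gamma_\sigma. \qquad \text{(J.15)}$$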


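J.1.3 Hermitian conjugate of spinor matrix elements

$$[\bar{u}(p',s')\,\Gamma\,u(p,s)]^\dagger = \bar{u}(p,s)\,\bar{\Gamma}\,u(p',s') \qquad \text{(J.16)}$$

where $\Gamma$ is any collection of γ matrices and

$$\bar{\Gamma} = \gamma^0\Gamma^\dagger\gamma^0. \qquad \text{(J.17)}$$

For example

$$\overline{\gamma^\mu} = \gamma^\mu \qquad \text{(J.18)}$$

and

$$\overline{\gamma^\mu\gamma_5} = \gamma^\mu\gamma_5. \qquad \text{(J.19)}$$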


J.1.4 Spin sums and projection operators 
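Positive-energy projection operator:

$$[\Lambda_+(p)]_{\alpha\beta} \equiv \sum_s u_\alpha(p,s)\,\bar{u}_\beta(p,s) = (\slashed{p} + m)_{\alpha\beta}. \qquad \text{(J.20)}$$

Negative-energy projection operator:

$$[\Lambda_-(p)]_{\alpha\beta} \equiv -\sum_s v_\alpha(p,s)\,\bar{v}_\beta(p,s) = (-\slashed{p} + m)_{\alpha\beta}. \qquad \text{(J.21)}$$

Note that these forms are specific to the normalizations

$$\bar{u}u = 2m, \qquad \bar{v}v = -2m \qquad \text{(J.22)}$$

for the spinors.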
J.2 Trace theorems 
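$$\mathrm{Tr}\,\mathbf{1} = 4 \quad \text{(theorem 1)} \qquad \text{(J.23)}$$
$$\mathrm{Tr}\,\gamma_5 = 0 \quad \text{(theorem 2)} \qquad \text{(J.24)}$$
$$\mathrm{Tr}(\text{odd number of } \gamma\text{’s}) = 0 \quad \text{(theorem 3)} \qquad \text{(J.25)}$$

Proof
Consider

$$T \equiv \mathrm{Tr}(\slashed{a}_1\slashed{a}_2\cdots\slashed{a}_n) \qquad \text{(J.26)}$$

where $n$ is odd. Now insert $1 = (\gamma_5)^2$ into $T$, so that

$$T = \mathrm{Tr}(\slashed{a}_1\slashed{a}_2\cdots\slashed{a}_n\gamma_5\gamma_5). \qquad \text{(J.27)}$$

Move the first $\gamma_5$ to the front of $T$ by repeatedly using the result

$$\slashed{a}\gamma_5 = -\gamma_5\slashed{a}. \qquad \text{(J.28)}$$

We therefore pick up $n$ minus signs:

$$T = \mathrm{Tr}(\slashed{a}_1\cdots\slashed{a}_n) = (-1)^n\,\mathrm{Tr}(\gamma_5\slashed{a}_1\cdots\slashed{a}_n\gamma_5) = (-1)^n\,\mathrm{Tr}(\slashed{a}_1\cdots\slashed{a}_n\gamma_5\gamma_5) \ \text{(cyclic property of trace)} = -\mathrm{Tr}(\slashed{a}_1\cdots\slashed{a}_n) \ \text{for } n \text{ odd}. \qquad \text{(J.29)}$$

Thus, for $n$ odd, $T$ must vanish.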


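$$\mathrm{Tr}(\slashed{a}\slashed{b}) = 4a\cdot b \quad \text{(theorem 4)}. \qquad \text{(J.30)}$$

Proof

$$\mathrm{Tr}(\slashed{a}\slashed{b}) = \tfrac{1}{2}\mathrm{Tr}(\slashed{a}\slashed{b} + \slashed{b}\slashed{a}) = \tfrac{1}{2}a_\mu b_\nu\,\mathrm{Tr}(\mathbf{1}\cdot 2g^{\mu\nu}) = 4a\cdot b.$$

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a\cdot b)(c\cdot d) + (a\cdot d)(b\cdot c) - (a\cdot c)(b\cdot d)] \quad \text{(theorem 5)}. \qquad \text{(J.31)}$$

Proof

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,\mathrm{Tr}(\slashed{c}\slashed{d}) - \mathrm{Tr}(\slashed{b}\slashed{a}\slashed{c}\slashed{d}) \qquad \text{(J.32)}$$

using the result of (J.6). We continue taking $\slashed{a}$ through the trace in this manner and use
(J.30) to obtain

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,4(c\cdot d) - 2(a\cdot c)\,\mathrm{Tr}(\slashed{b}\slashed{d}) + \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{a}\slashed{d})$$
$$= 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + 8(b\cdot c)(a\cdot d) - \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{d}\slashed{a}) \qquad \text{(J.33)}$$

and, since we can bring $\slashed{a}$ to the front of the trace, we have proved the theorem.

$$\mathrm{Tr}[\gamma_5\slashed{a}] = 0 \quad \text{(theorem 6)}. \qquad \text{(J.34)}$$

This is a special case of theorem 3 since $\gamma_5$ contains four γ matrices.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}] = 0 \quad \text{(theorem 7)}. \qquad \text{(J.35)}$$

This is not so obvious; it may be proved by writing out all the possible products of γ
matrices that arise.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}] = 0 \quad \text{(theorem 8)}. \qquad \text{(J.36)}$$

Again this is a special case of theorem 3.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}\slashed{d}] = 4i\,\epsilon_{\alpha\beta\gamma\delta}\,a^\alpha b^\beta c^\gamma d^\delta \quad \text{(theorem 9)}. \qquad \text{(J.37)}$$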
This theorem follows by looking at components. The $\epsilon$ tensor just gives the correct sign of
the permutation.
The $\epsilon$ tensor is the four-dimensional generalization of the three-dimensional
antisymmetric tensor $\epsilon_{ijk}$. In the three-dimensional case, we have the well-known results

$$(\mathbf{b}\times\mathbf{c})_i = \epsilon_{ijk}\,b_j c_k \qquad \text{(J.38)}$$

and

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \epsilon_{ijk}\,a_i b_j c_k \qquad \text{(J.39)}$$

for the triple scalar product.
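
Several of the identities in this appendix can be spot-checked numerically with explicit 4×4 matrices. The sketch below assumes the standard (Dirac) representation of the γ matrices, the metric diag(+1, −1, −1, −1), the convention that the totally antisymmetric tensor with lower indices equals +1 for the ordering 0123, and randomly generated real 4-vectors; the helper names (`slash`, `dot`, `levi_civita`) are illustrative, not from the text.

```python
from itertools import permutations

import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# gamma^mu in the Dirac representation; metric g = diag(+1, -1, -1, -1).
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
g = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(a):
    """a-slash = a^mu gamma_mu, for contravariant components a^mu."""
    return sum(g[m, m] * a[m] * gamma[m] for m in range(4))


def dot(a, b):
    """Minkowski product a.b = g_{mu nu} a^mu b^nu."""
    return a @ g @ b


def levi_civita():
    """epsilon_{mu nu rho sigma} with epsilon_{0123} = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in permutations(range(4)):
        inversions = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


rng = np.random.default_rng(0)          # arbitrary seed for the test 4-vectors
a, b, c, d = rng.normal(size=(4, 4))

trace2 = np.trace(slash(a) @ slash(b))                                  # theorem 4
trace3 = np.trace(slash(a) @ slash(b) @ slash(c))                       # theorem 3
trace4 = np.trace(slash(a) @ slash(b) @ slash(c) @ slash(d))            # theorem 5
trace5 = np.trace(gamma5 @ slash(a) @ slash(b) @ slash(c) @ slash(d))   # theorem 9

expected2 = 4 * dot(a, b)
expected4 = 4 * (dot(a, b) * dot(c, d) + dot(a, d) * dot(b, c) - dot(a, c) * dot(b, d))
expected5 = 4j * np.einsum("abcd,a,b,c,d->", levi_civita(), a, b, c, d)
```

A numerical check of this kind does not replace the proofs, but it is a quick way to catch sign and index-placement errors when the identities are used in a longer calculation.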


K 


Example of a Cross Section Calculation 


In this appendix we outline in more detail the calculation of the $e^-s^+$ elastic scattering
cross section in section 8.3.2. The standard factors for the unpolarized cross section lead to
the expression

$$\mathrm{d}\sigma = \frac{1}{4E\omega|\mathbf{v}|}\,\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2\,\mathrm{dLips}(s; k', p') \qquad \text{(K.1)}$$

$$\phantom{\mathrm{d}\sigma} = \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\,\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2\,\mathrm{dLips}(s; k', p') \qquad \text{(K.2)}$$

using the result of problem 6.9, and the definition of Lorentz-invariant phase space:

$$\mathrm{dLips}(s; k', p') = (2\pi)^4\delta^4(k'+p'-k-p)\,\frac{\mathrm{d}^3\mathbf{p}'}{(2\pi)^3\,2E'}\,\frac{\mathrm{d}^3\mathbf{k}'}{(2\pi)^3\,2\omega'}. \qquad \text{(K.3)}$$

Instead of evaluating the matrix element and phase space integral in the CM frame, or
writing the result in invariant form, we shall perform the calculation entirely in the ‘laboratory’
frame, defined as the frame in which the target (i.e. the s-particle) is at rest:

$$p^\mu = (M, \mathbf{0}) \qquad \text{(K.4)}$$


where $M$ is the s-particle mass. Let us look in some detail at the ‘laboratory’ frame
kinematics for elastic scattering (figure K.1). Conservation of energy and momentum in the
form

$$p'^2 = (p+q)^2 \qquad \text{(K.5)}$$

allows us to eliminate $p'$ to obtain the elastic scattering condition

$$2p\cdot q + q^2 = 0 \qquad \text{(K.6)}$$

or

$$2p\cdot q = Q^2 \qquad \text{(K.7)}$$

if we introduce the positive quantity

$$Q^2 = -q^2 \qquad \text{(K.8)}$$

for a scattering process.
In all the applications with which we are concerned, it will be a good approximation to 
neglect electron mass effects for high-energy electrons. We therefore set 


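$$k^2 = k'^2 \approx 0 \qquad \text{(K.9)}$$

so that

$$s + t + u \approx 2M^2 \qquad \text{(K.10)}$$

where

$$s = (k+p)^2 = (k'+p')^2 \qquad \text{(K.11)}$$
$$t = (k-k')^2 = (p'-p)^2 = q^2 \qquad \text{(K.12)}$$
$$u = (k-p')^2 = (k'-p)^2 \qquad \text{(K.13)}$$

are the usual Mandelstam variables.

FIGURE K.1
Laboratory frame kinematics.

For the electron 4-vectors

$$k^\mu = (\omega, \mathbf{k}) \qquad \text{(K.14)}$$
$$k'^\mu = (\omega', \mathbf{k}') \qquad \text{(K.15)}$$

we can neglect the difference between the magnitude of the 3-momentum and the energy,

$$\omega \approx |\mathbf{k}| \equiv k \qquad \text{(K.16)}$$
$$\omega' \approx |\mathbf{k}'| \equiv k' \qquad \text{(K.17)}$$

and in this approximation

$$q^2 = -2kk'(1 - \cos\theta) \qquad \text{(K.18)}$$

or

$$q^2 = -4kk'\sin^2(\theta/2). \qquad \text{(K.19)}$$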

The elastic scattering condition (K.7) gives the following relation between $k$, $k'$, and $\theta$:

$$(k/k') = 1 + (2k/M)\sin^2(\theta/2). \qquad \text{(K.20)}$$

It is important to realize that this relation is only true for elastic scattering: for inclusive
inelastic electron scattering $k$, $k'$, and $\theta$ are independent variables.
The first element of the cross section, the flux factor, is easy to evaluate:

$$4[(k\cdot p)^2 - m^2M^2]^{1/2} \approx 4Mk \qquad \text{(K.21)}$$


in the approximation of neglecting the electron mass m. We now consider the calculation 
of the spin-averaged matrix element and the phase space integral in turn. 
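
The kinematic relations above are easy to confirm numerically. The sketch below builds the four 4-vectors in the target rest frame for a massless electron, with the beam along the z-axis and scattering in the x-z plane, and evaluates the invariants. The beam energy, angle, and target mass are arbitrary illustrative numbers, and the helper names are not from the text.

```python
import numpy as np


def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - np.dot(a[1:], b[1:])


def scattered_energy(k, theta, M):
    """k' from the elastic scattering condition, electron mass neglected."""
    return k / (1.0 + (2.0 * k / M) * np.sin(theta / 2.0) ** 2)


def lab_four_vectors(k, theta, M):
    """Return (k, k', p, p') as 4-vectors in the target rest frame, beam along z."""
    kp = scattered_energy(k, theta, M)
    k_in = np.array([k, 0.0, 0.0, k])
    k_out = np.array([kp, kp * np.sin(theta), 0.0, kp * np.cos(theta)])
    p_in = np.array([M, 0.0, 0.0, 0.0])
    p_out = k_in + p_in - k_out           # 4-momentum conservation
    return k_in, k_out, p_in, p_out


k, theta, M = 10.0, 0.6, 0.938            # illustrative values (e.g. GeV, rad, GeV)
k_in, k_out, p_in, p_out = lab_four_vectors(k, theta, M)
kp = scattered_energy(k, theta, M)

q = k_in - k_out                          # 4-momentum transfer
s = mdot(k_in + p_in, k_in + p_in)
t = mdot(q, q)
u = mdot(k_in - p_out, k_in - p_out)
flux = 4.0 * np.sqrt(mdot(k_in, p_in) ** 2)   # flux factor with m = 0

if __name__ == "__main__":
    print("p'^2 - M^2      =", mdot(p_out, p_out) - M**2)
    print("q^2, closed form =", t, -4.0 * k * kp * np.sin(theta / 2.0) ** 2)
    print("s + t + u - 2M^2 =", s + t + u - 2.0 * M**2)
    print("flux, 4Mk        =", flux, 4.0 * M * k)
```

The recoiling target comes out on its mass shell only because k' is fixed by the elastic condition; choosing k' independently of θ would describe inelastic scattering instead.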


K.1 The spin-averaged squared matrix element 


The Feynman rules for $e^-s^+$ scattering enable us to write the spin sum in the form

$$\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}|^2 = \left(\frac{4\pi\alpha}{q^2}\right)^2 L_{\mu\nu}T^{\mu\nu} \qquad \text{(K.22)}$$

where $L_{\mu\nu}$ is the lepton tensor, $T^{\mu\nu}$ the s-particle tensor, and the one-photon exchange
approximation has been assumed. From problem 8.12 we find the result

$$L_{\mu\nu}T^{\mu\nu} = 8[2(k\cdot p)(k'\cdot p) + (q^2/2)M^2]. \qquad \text{(K.23)}$$

In the ‘laboratory’ frame, neglecting the electron mass, this becomes

$$L_{\mu\nu}T^{\mu\nu} = 16M^2kk'\cos^2(\theta/2). \qquad \text{(K.24)}$$


K.2 Evaluation of two-body Lorentz-invariant phase space in 
‘laboratory’ variables 


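We must evaluate

$$\mathrm{dLips}(s; k', p') = \frac{1}{(4\pi)^2}\,\delta^4(k'+p'-k-p)\,\frac{\mathrm{d}^3\mathbf{p}'}{E'}\,\frac{\mathrm{d}^3\mathbf{k}'}{\omega'} \qquad \text{(K.25)}$$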


in terms of ‘laboratory’ variables. This is in fact rather tricky and requires some care. There 
are several ways it can be done: 


(a) Use CM variables, put the cross section into invariant form, and then translate
to the ‘laboratory’ frame. This involves relating $\mathrm{d}q^2$ to $\mathrm{d}(\cos\theta)$ which we shall do
as an exercise at the end of this appendix.


(b) Alternatively, we can work directly in terms of ‘laboratory’ variables and write

$$\mathrm{d}^3\mathbf{p}'/2E' = \mathrm{d}^4p'\,\delta(p'^2 - M^2)\,\theta(p'^0). \qquad \text{(K.26)}$$

The four-dimensional δ-function then removes the integration over $\mathrm{d}^4p'$ leaving
us only with an integration over the single δ-function $\delta(p'^2 - M^2)$, in which $p'$ is
understood to be replaced by $k + p - k'$. For details of this last integration, see
Bjorken and Drell (1964, p 114).


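(c) We shall evaluate the phase space integral in a more direct manner. We begin
by performing the integral over $\mathrm{d}^3\mathbf{p}'$ using the three-dimensional δ-function from
$\delta^4(k'+p'-k-p)$. In the ‘laboratory’ frame $\mathbf{p} = \mathbf{0}$, so we have

$$\int \mathrm{d}^3\mathbf{p}'\,\delta^3(\mathbf{k}'+\mathbf{p}'-\mathbf{k})\,f(\mathbf{p}',\mathbf{k}',\mathbf{k}) = f(\mathbf{p}',\mathbf{k}',\mathbf{k})\big|_{\mathbf{p}'=\mathbf{k}-\mathbf{k}'}. \qquad \text{(K.27)}$$

In the particular function $f(\mathbf{p}',\mathbf{k}',\mathbf{k})$ that we require, $\mathbf{p}'$ only appears via $E'$, since

$$E'^2 = \mathbf{p}'^2 + M^2 \qquad \text{(K.28)}$$

and

$$\mathbf{p}'^2 = k^2 + k'^2 - 2kk'\cos\theta \qquad \text{(K.29)}$$

(setting the electron mass $m$ to zero). We now change $\mathrm{d}^3\mathbf{k}'$ to angular variables:

$$\mathrm{d}^3\mathbf{k}'/\omega' \approx k'\,\mathrm{d}k'\,\mathrm{d}\Omega \qquad \text{(K.30)}$$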
leading to

$$\mathrm{dLips}(s; k', p') = \frac{1}{(4\pi)^2}\,\mathrm{d}\Omega\,\mathrm{d}k'\,\frac{k'}{E'}\,\delta(E'+k'-k-M). \qquad \text{(K.31)}$$

Since $E'$ is a function of $k'$ and $\theta$ for a given $k$ (cf (K.28) and (K.29)), the δ-function relates
$k'$ and $\theta$ as required for elastic scattering (cf (K.20)), but until the δ-function integration
is performed they must be regarded as independent variables. We have the integral

$$\frac{1}{(4\pi)^2}\,\mathrm{d}\Omega\int \mathrm{d}k'\,\frac{k'}{E'}\,\delta(f(k',\cos\theta)) \qquad \text{(K.32)}$$

where

$$f(k',\cos\theta) = [(k^2 + k'^2 - 2kk'\cos\theta) + M^2]^{1/2} + k' - k - M \qquad \text{(K.33)}$$

remaining to be evaluated. In order to obtain a differential cross section, we wish to integrate
over $k'$; for this $k'$ integration we must regard $\cos\theta$ in $f(k',\cos\theta)$ as a constant, and use the
result (E.36):

$$\delta(f(x)) = \frac{1}{|f'(x)|_{x=x_0}}\,\delta(x - x_0) \qquad \text{(K.34)}$$

where $f(x_0) = 0$. The required derivative is

$$\left.\frac{\partial f}{\partial k'}\right|_{\text{constant }\cos\theta} = \frac{1}{E'}\,(E' + k' - k\cos\theta) \qquad \text{(K.35)}$$

and the δ-function requires that $k'$ is determined from $k$ and $\theta$ by the elastic scattering
condition

$$k' = \frac{k}{1 + (2k/M)\sin^2(\theta/2)} \equiv k'(\cos\theta). \qquad \text{(K.36)}$$


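The integral (K.32) becomes

$$\frac{1}{(4\pi)^2}\,\mathrm{d}\Omega\int \mathrm{d}k'\,\frac{k'}{E'}\,\frac{1}{|\mathrm{d}f/\mathrm{d}k'|_{k'=k'(\cos\theta)}}\,\delta[k' - k'(\cos\theta)] \qquad \text{(K.37)}$$

and, after some juggling, $\mathrm{d}f/\mathrm{d}k'$ evaluated at $k' = k'(\cos\theta)$ may be written as

$$\left.\frac{\mathrm{d}f}{\mathrm{d}k'}\right|_{k'=k'(\cos\theta)} = \frac{Mk}{E'k'}. \qquad \text{(K.38)}$$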
Thus we obtain finally the result

$$\mathrm{dLips}(s; k', p') = \frac{1}{(4\pi)^2}\,\frac{k'^2}{Mk}\,\mathrm{d}\Omega \qquad \text{(K.39)}$$

for two-body elastic scattering in terms of ‘laboratory’ variables, neglecting lepton masses.
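
The ‘juggling’ behind the δ-function integration can be checked numerically. The sketch below assumes a massless electron, takes the argument of the energy δ-function to be E' + k' − k − M with E'² = k² + k'² − 2kk' cos θ + M², differentiates it by a central finite difference at the elastic root, and compares the resulting weight (k'/E')/|df/dk'| with k'²/(Mk). The numerical values and function names are illustrative assumptions.

```python
import numpy as np


def recoil_energy(kp, k, cos_theta, M):
    """E' = sqrt(k^2 + k'^2 - 2 k k' cos(theta) + M^2) for a massless electron."""
    return np.sqrt(k * k + kp * kp - 2.0 * k * kp * cos_theta + M * M)


def f(kp, k, cos_theta, M):
    """Argument of the energy delta-function: E' + k' - k - M."""
    return recoil_energy(kp, k, cos_theta, M) + kp - k - M


def elastic_kp(k, cos_theta, M):
    """Root of f in k': k' = k / (1 + (k/M)(1 - cos theta))."""
    return k / (1.0 + (k / M) * (1.0 - cos_theta))


def phase_space_weight(k, cos_theta, M, h=1e-6):
    """(k'/E') / |df/dk'| at the elastic root, with df/dk' from a central difference."""
    kp = elastic_kp(k, cos_theta, M)
    dfdk = (f(kp + h, k, cos_theta, M) - f(kp - h, k, cos_theta, M)) / (2.0 * h)
    return (kp / recoil_energy(kp, k, cos_theta, M)) / abs(dfdk)


if __name__ == "__main__":
    k, M = 10.0, 0.938                    # illustrative values only
    for c in (-0.8, 0.0, 0.7):
        kp = elastic_kp(k, c, M)
        print(c, phase_space_weight(k, c, M), kp * kp / (M * k))
```

Multiplying the weight by dΩ/(4π)² then gives the two-body phase space in ‘laboratory’ variables.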
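Putting all these elements together yields the advertised result

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} \equiv \left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{\rm ns} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{k'}{k}\,\cos^2(\theta/2). \qquad \text{(K.40)}$$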
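
The intermediate algebra is short enough to write out. The block below is a sketch that assumes the three ingredients in the forms used in this appendix: the flux factor $4Mk$, the spin-averaged squared matrix element $(4\pi\alpha/q^2)^2\,16M^2kk'\cos^2(\theta/2)$, and the phase space $k'^2\,\mathrm{d}\Omega/[(4\pi)^2Mk]$, together with $q^2 = -4kk'\sin^2(\theta/2)$.

```latex
\begin{align*}
\mathrm{d}\sigma
 &= \frac{1}{4Mk}\left(\frac{4\pi\alpha}{q^2}\right)^2 16M^2kk'\cos^2(\theta/2)\,
    \frac{1}{(4\pi)^2}\,\frac{k'^2}{Mk}\,\mathrm{d}\Omega \\
 &= \frac{4\alpha^2 k'^3\cos^2(\theta/2)}{k\,(q^2)^2}\,\mathrm{d}\Omega \\
 &= \frac{4\alpha^2 k'^3\cos^2(\theta/2)}{16\,k^3k'^2\sin^4(\theta/2)}\,\mathrm{d}\Omega
  = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{k'}{k}\,\cos^2(\theta/2)\,\mathrm{d}\Omega .
\end{align*}
```

The factors of $M$ cancel completely, so the target mass enters the final result only through the recoil ratio $k'/k$.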


As a final twist to this calculation, let us consider the change of variables from $\mathrm{d}\Omega$ to $\mathrm{d}q^2$
in this elastic scattering example. In the unpolarized case

$$\mathrm{d}\Omega = 2\pi\,\mathrm{d}(\cos\theta) \qquad \text{(K.41)}$$

and

$$q^2 = -2kk'(1 - \cos\theta) \qquad \text{(K.42)}$$

where

$$k' = \frac{k}{1 + (2k/M)\sin^2(\theta/2)}. \qquad \text{(K.43)}$$

Thus, since $k'$ and $\cos\theta$ are not independent variables, we have

$$\mathrm{d}q^2 = 2kk'\,\mathrm{d}(\cos\theta) + (1 - \cos\theta)(-2k)\,\frac{\mathrm{d}k'}{\mathrm{d}(\cos\theta)}\,\mathrm{d}(\cos\theta). \qquad \text{(K.44)}$$

From (K.20) we find

$$\frac{\mathrm{d}k'}{\mathrm{d}(\cos\theta)} = \frac{k'^2}{M} \qquad \text{(K.45)}$$

and, after some routine juggling, arrive at the result

$$\mathrm{d}q^2 = 2k'^2\,\mathrm{d}(\cos\theta). \qquad \text{(K.46)}$$

If we introduce the variable $\nu$ defined, for elastic scattering, by

$$2p\cdot q \equiv 2M\nu = -q^2 \qquad \text{(K.47)}$$

we have immediately

$$\mathrm{d}\nu = -\frac{k'^2}{M}\,\mathrm{d}(\cos\theta). \qquad \text{(K.48)}$$

Similarly, if we introduce the variable $y$ defined by

$$y = \nu/k \qquad \text{(K.49)}$$

we find

$$\mathrm{d}y = -\frac{k'^2}{2\pi kM}\,\mathrm{d}\Omega \qquad \text{(K.50)}$$

for elastic scattering.
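
The change of variables can be verified by differentiating numerically. The sketch below assumes a massless electron, the elastic relation k' = k/[1 + (k/M)(1 − cos θ)] (the same relation written with 2 sin²(θ/2) = 1 − cos θ), and ν = −q²/2M; it compares central finite differences in cos θ with the closed forms 2k'² and −k'²/M. All numbers and function names are illustrative choices.

```python
import numpy as np


def scattered_energy(k, cos_theta, M):
    """Elastic k' as a function of cos(theta), electron mass neglected."""
    return k / (1.0 + (k / M) * (1.0 - cos_theta))


def q_squared(k, cos_theta, M):
    """q^2 = -2 k k' (1 - cos theta), with k' tied to cos(theta) by elasticity."""
    return -2.0 * k * scattered_energy(k, cos_theta, M) * (1.0 - cos_theta)


def nu(k, cos_theta, M):
    """Energy transfer nu = -q^2 / (2 M) for elastic scattering."""
    return -q_squared(k, cos_theta, M) / (2.0 * M)


def derivative(func, x, h=1e-6):
    """Central finite-difference estimate of d(func)/dx."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


if __name__ == "__main__":
    k, M = 10.0, 0.938                    # illustrative values only
    for c in (-0.5, 0.2, 0.9):
        kp = scattered_energy(k, c, M)
        print(c,
              derivative(lambda x: q_squared(k, x, M), c), 2.0 * kp**2,
              derivative(lambda x: nu(k, x, M), c), -kp**2 / M)
```

The negative sign of dν/d(cos θ) simply reflects that the energy transfer decreases as the scattering becomes more forward; only its magnitude enters when converting a cross section from one variable to the other.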


L 


Feynman Rules for Tree Graphs in QED 


2 → 2 cross section formula

$$\mathrm{d}\sigma = \frac{1}{4[(p_1\cdot p_2)^2 - m_1^2m_2^2]^{1/2}}\,|\mathcal{M}|^2\,\mathrm{dLips}(s; p_3, p_4)$$

1 → 2 decay formula

$$\mathrm{d}\Gamma = \frac{1}{2m_1}\,|\mathcal{M}|^2\,\mathrm{dLips}(m_1^2; p_2, p_3)$$

Note that for two identical particles in the final state an extra factor of $\frac{1}{2}$ must be included
in these formulae.

The amplitude $i\mathcal{M}$ is the invariant matrix element for the process under
consideration, and is given by the Feynman rules of the relevant theory. For particles with non-zero
spin, unpolarized cross sections are formed by averaging over initial spin components and
summing over final.


L.1 External particles 
Spin-½
For each fermion or anti-fermion line entering the graph include the spinor

$$u(p,s) \quad \text{or} \quad v(p,s) \qquad \text{(L.1)}$$

and for spin-½ particles leaving the graph the spinor

$$\bar{u}(p',s') \quad \text{or} \quad \bar{v}(p',s'). \qquad \text{(L.2)}$$

Photons
For each photon line entering the graph include a polarization vector

$$\epsilon_\mu(k,\lambda) \qquad \text{(L.3)}$$

and for photons leaving the graph the vector

$$\epsilon^*_\mu(k',\lambda'). \qquad \text{(L.4)}$$
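L.2 Propagators


Spin-0

$$\frac{i}{p^2 - m^2 + i\epsilon} \qquad \text{(L.5)}$$

Spin-½

$$\frac{i}{\slashed{p} - m} = \frac{i(\slashed{p} + m)}{p^2 - m^2 + i\epsilon} \qquad \text{(L.6)}$$

Photon

$$\frac{i}{k^2 + i\epsilon}\left[-g^{\mu\nu} + (1 - \xi)\,\frac{k^\mu k^\nu}{k^2}\right] \qquad \text{(L.7)}$$

for a general $\xi$. Calculations are usually performed in the Lorentz or Feynman gauge with
$\xi = 1$ and photon propagator equal to

$$\frac{i(-g^{\mu\nu})}{k^2 + i\epsilon}. \qquad \text{(L.8)}$$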


L.3 Vertices
Spin-0

$$-ie(p + p')_\mu \quad \text{(for charge } +e\text{)}$$

$$2ie^2 g_{\mu\nu}$$

Spin-½

$$-ie\gamma_\mu \quad \text{(for charge } +e\text{)}$$
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Feynman rules 
for ABC theory, 142, 148 
for loops, 148, 250 
for QED, 197-198, 207, 346-347 
Field, electromagnetic, 11 
quantization of, 165-169 
Field strength renormalization, 255-257, 
270, 273, 276-277 
constant, 256 
Field theory, classical 
Lagrange-Hamilton approach, 111-114 
Field theory, quantum, see 
Quantum field theory 
Fine structure constant, 183, 247 
g’-dependent, 280-283 
Flavour 
lepton, 5—6 
quark, 9 
Flux factor, 145, 224, 330-331, 333 
for virtual photon, 229 
Form factor, electromagnetic, 202 
of nucleon, 215-216 
and invariance arguments, 214-215 
Dirac charge, 215 
electric, 216 
magnetic, 216 
Pauli anomalous magnetic moment, 
215 
q?-dependence, 216 
radiatively induced, 279-280 
of pion, 200-207 
and invariance arguments, 202-204 
in the time-like region, 204-207, 
238-239 
static, 202 
Form invariance, see Covariance 
Fourier series, 314-316 
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4-momentum conservation, 133, 192 

4-vector, 308-311 

Four-vector potential, electromagnetic, 
37-38, 162-169 


Galilean transformation, 92 
y matrices, 77, 91, 188, 337-338 
anticommutation relations, 337 
trace theorems, 338-340 
ys matrix, 81, 337-338 
g factor, 66-67 
prediction of g = 2 from Dirac equation, 
66-68 
QED corrections to, 68, 287—291 
Gauge 
bosons, of SM, 24 
choice of, 163 
and photon propagator, 169 
covariance, in quantum mechanics, 
39—42 
covariant derivative, 42 
field, 44 
invariance, 18, 34-42, 163, 196 
and charge conservation, 38, 170-171 
of QED, 169, 196 
and masslessness of photon, 18, 21, 
2ri-278 
and Maxwell equations, 36-39 
and photon polarization states, 
163-164 
and Schrodinger current, 41 
and Ward identity, 208-209 
as dynamical principle, 42—49 
in classical electromagnetism, 
36-39 
in Compton scattering, 208, 219 
in quantum mechanics, 39-42 
parameter, 169 
physical results independent of, 169 
principle, 42—49, 170 
theories, 19-21, 29, 33-39 
transformation, 26-38 
and quantum mechanics, 39-42 
Gauss’s divergence theorem, 325 
Gauss’s law, 34, 306, 325 
General relativity, 39 
Generations, 4, 8 
and anomalies, 9 
Ginzburg-Landau theory, 26 
Glashow-Iliopoulos-Maiani (GIM) 
mechanism, 8 
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Glashow-Salam-Weinberg (GSW) theory, 3, 
12, 21-22, 24, 26 
renormalizability of, 25-27 
Gluon, 21-24, 233 
Gluons, and momentum sum rule, 233 
Golden Rule, 334 
Goldstone quantum, 278 
Gordon decomposition of current, 215, 221 
Gravity, 4 
Green function, 138, 149-150, 325-329 
Group, 46 
U(1), 46 
Gupta-Bleuler formalism, 168 


Hadron, 6 
Hamiltonian, 106-117, 122, 124, 154, 160, 
168, 194 
classical, 106 
density, 113, 115, 160 
Dirac, 160, 171 
for charged particle in electromagnetic 
field, 39 
Klein-Gordon, 121 
Maxwell, 168 
operator, 108 
string, 114 
Hamilton’s equations, 107 
Hand cross section for virtual photons, 229, 
243 
Harmonic approximation, 97—98, 100, 124 
Heaviside-Lorentz units, 306-307 
Heisenberg 
equation of motion, 108, 118, 122-123, 
127, 236 
picture (formulation) of quantum 
mechanics, 103-108, 127-128, 336 
Helicity, 60 
conservation, 190, 243 
HERA, 244 
Higgs 
boson, 26-28 
mass, 28 
spin, 28 
coupling constant, 283 
field, 26-27, 122 
and renormalizability of GSW theory, 
26-27 
mechanism, 278 
sector, 283 
Hofstadter experiments, 7 
Hole theory, 62-63 
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Inelastic scattering, see 
Scattering 
e` -proton inelastic 
Interaction picture, 127-129 
Interactions 
electromagnetic, 17-18 
introduction via the gauge principle, 
170-173 
of spin-0 particles, 183-187 
of spin-4 particles, 187—194 
in quantum field theory 
qualitative description of, 124-126 
Interference terms, in quantum mechanics, 
43 
Interquark potential, 22 
Invariant amplitude, 133, 142, 195, 198, 206, 
208, 211, 284 
Invariance 
and dynamical theories, 33-34 
global, 33-34, 38, 49 
local, 33-34, 38, 49 
phase, 42—44 
Lorentz, 308-311 


Jets, 21-23, 240 
J/w 8, 10; see also Charmonium 


K flux factor, 229 
Klein-Gordon equation, 51-53 
and C, 82-83 
and P, 79 
and T, 87 
derivation, 51-52 
free-particle solutions, 52 
normalization of, 185 
first-order perturbation theory for, 183 
negative-energy solutions, 52, 63-65 
negative probabilities, 53 
potential, 66, 70, 183 
probability current density, 52-53, 65, 
68 
probability density, 52-53, 65, 68 
Klein-Gordon field, 120-121, 152-157 


Lagrangian, 104-107 
ABC, 258, 263 
classical field mechanics, 111-112 
density, 111 
Dirac, 158 
Klein-Gordon, 120 
Maxwell, 162, 166, 178 
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particle mechanics, 104—106 
quantum field dynamics, 114-119 
QED, 270 
Schrodinger, 122 
string, 113, 120 
Lamb shift, 279 
Uehling contribution, 279 
Landau gauge, 169 
Least action, Hamilton’s principle of, 104— 
106 
Lenz’s law, 35 
Lepton, 4-6 
flavour, 5-6 
universality, 4, 27 
Lepton quantum numbers, 5-6 
Lepton tensor, see 
Tensor 
lepton, 
Leptoquark, 244 
Linear superposition, 97-98, 126 
Loop diagrams, 147-148, 247—248 
and divergences, 148, 250-251 
and renormalization, 148, 247—297 
in ABC theory, 247-269 
in QED, 270-297 
and unitarity, 284 
closed fermion, 273 
Loop momenta, 147 
Lorentz 
condition, 163, 166, 168, 196 
covariance, 37 
force law, 39 
gauge, 169, 203, 208 
invariance, 308-311 
and form factors, 202-203, 214 
and inelastic hadron tensor, 223 
-invariant phase space (Lips), 145, 341, 
343 
in CM frame, 146 
in ‘laboratory’ frame, 343-344 
transformations, 308 
and Dirac equation, 74-78 
and KG equation, 72-73, 90 


u decay, 6 
Magnetic moment 
anomalous, 287-291 
and renormalizability, 288 
of electron, 65-68, 288-290 
of muon, 290-291 
orbital, 302 
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Majorana 
fermion, 84 
field, 175 
mass term, 175 
spinor, 84 
Mandelstam 
s variable, 143, 209, 342 
t variable, 143, 342 
u variable, 143, 250, 342 
Mass 
effective, 254 
physical, 254, 258, 265 
running, 27 
shift, 253-254 
Mass-shell condition, 143, 264-265 
Massless spin-5 particle, wave equation for, 
243 
Massless vector field, wave equation for, 163 
Maxwell field, 162-169 
Maxwell’s equations, 11, 33, 34-38 
and Lagrangian field theory, 162-164 
and units, 306-307 
gauge invariance of, 34, 36-38 
Lorentz covariance of, 36-38, 308-311 
Meissner effect, 26 
Meson spectroscopy, 7 
Metric tensor, 309 
MKS units, 304, 306-307 
Mode, 99-104 
frequency, 95-103 
normal, 99-103 
coordinates, 99-100, 102 
expansion, 102, 113, 116, 121, 124, 127, 
158, 166-167 
interacting, 124 
oscillator, 100-101, 103 
quanta, 101 
superposition, 99 
operators, 116 
time-like, 167-168 
Momentum, generalized 
canonically conjugate, 113, 165 
Momentum sum rule, 233 
Mott cross section, 190 
Muon, 4 
Muonic atoms, 279 


Natural units, 304-305 
Negative-energy solutions 

Dirac’s interpretation, 62-63 
Feynman’s interpretation, 63-65 
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Neutral current process, 19 
Neutrino, 

e-type and p-type, 5 

mass, 6 

mixing, 6 

oscillations, 6, 31 
Newton’s constant (of gravity), 31, 297 
Noether’s theorem, 153, 160 
Non-Abelian gauge theories, 46 
and asymptotic freedom, 283 
Non-relativistic quantum mechanics, revi- 
sion, 301-303 
Non-renormalizable 
term, 295, 297 
theory, 25, 281-292, 296 
Normalization 

box, 133, 301 

covariant, 185, 188 

of states, 132, 185, 189 
Normal ordering, 117, 154, 160-161, 187, 

194 


O(2) transformation, 152, 154 
Off-mass-shell, 143 

One-photon exchange approximation, 222 
One-quantum exchange process, 13-17 
Operator product expansion, 234 

Optical theorem, 284, 331 

Oscillator, quantum, 109-111 


Pair creation, 63, 281 
Parity 
invariance, in electromagnetic interac- 
tions, 81, 174 
operator P, 80 
and KG equation, 79 
and Dirac equation, 79-80 
eigenvalues, 80 
in qft, 173-124 
transformation, P, 79 
and Dirac equation, 79-80 
and KG equation, 79 
intrinsic, 81 
opposite, for particle and antiparticle, 
80, 174 
violation, in weak interactions, 82 


Parton, 225-233 
and Breit (brick-wall) frame, 230 
and quarks and gluons, 231-233 
distribution function, 228, 231-233 
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model, 225-226 
and Drell-Yan process, 234-237 
sea, 233 
valence, 233 
Path integral formalism, 108-109, 165, 168 
Pauli 
exclusion principle, 104 
matrices, 55, 69 
Perturbation theory 
in interaction picture, 127-131 
in non-relativistic quantum mechanics 
(NRQM), 125, 256-257 
time-dependent, 182, 302-303 
in quantum field theory 
bare, 247—262, 267 
renormalized, 262-269 
Phase 
factor, non-integrable, 48 
invariance, 18 
global, 43 
local, 33-34, 43—44 
space, two-particle, 134 
evaluated in CM frame, 145-146 
evaluated in ‘laboratory’ frame, 
343-345 
Lorentz invariant, 134, 145, 343-344 
transformation, space-time dependent, 
42 
Phonon, 12, 101, 126 
Photon 
absorption and emission of, 303 
as excitation quantum of 
electromagnetic field, 104 
external, 207 
masslessness of, 18, 21, 24, 26 
and polarization states, 24, 163-164 
propagator, 168-169 
and gauge choice, 169 
virtual, 198, 229, 242 
Pion 
Compton wavelength, 304 
form factor, 203-204 
weak decay, 18 
Planck 
scale, 31 
Point-like interaction, 15, 20 
Poisson’s equation, 13, 201, 325, 327 
Polarization 
circular, 164 
hadronic vacuum, 289-291 
linear, 164 


364 


Pole 


of charge in dielectric, 280-281 

of vacuum, 281-282 

states 
for massive spin-1 bosons, 24, 26, 164 
for photons, 24, 163-164 
longitudinal, 167-168 
pseudo-completeness relation, 

169, 209 

time-like (scalar), 167-168 
transverse, 165, 207 

sum, for photons, 209 

vectors, for photons, 163-169 


in propagator. 143 
in complex plane, 260 
simple, 322-323 


Positron, prediction and discovery of, 63 
Positronium, 30 
Probability current 


for Dirac equation, 58, 65, 76-77 
4-vector character, 53, 58, 77 

for KG equation, 53, 64, 68 

for Schrödinger equation, 68, 301 


Probability distribution functions 


for partons, 228 
for quarks, 231-234 


Projection operators, 243, 338 
Propagator, 16 


complete, in ABC theory, 253 

in external line, 267 
for complex scalar field, 156-157 
for Dirac field, 161 
for photon, 165-169 

in arbitrary gauge, 169 
for scalar field, 137-142 
renormalized, in ABC theory, 267 


Pseudoscalar (under P), 81 
Psi meson (w/J particle), 9; see also 


Charmonium 


Quantum electrodynamics (QED), 3, 11, 18, 


20-22, 25-26, 29, 33, 173, 247, 258, 
262, 270-297 

introduction, 170-173 

renormalizability of, 26, 49, 287-288, 
295 

scalar, 172-178 

spinor, 170-172 

tests of, 287-291 


Quanta, 101, 111, 117-118 
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Quantum chromodynamics (QCD) 3, 7-10, 


13, 22-23, 283 
and asymptotic freedom, 10, 22, 283 
lattice, 23 


renormalizability of, 27 
Quantum field theory 
antiparticles in, 151-157, 160-161 
complex scalar field, 152-157 
Dirac field, 158-162 
fundamental commutator, 114, 159 
interacting scalar fields, 124-150 
internal symmetries in, 152-155, 160- 
161 
Klein-Gordon field, 120-121, 152-157 
Lagrange-Hamilton formulation, 114- 
119 
Maxwell field, 165-169 
perturbation theory for, 126-131 
qualitative description, 11-13, 96-104, 
124-126 
real scalar field, 114-121 
Quark, 3, 6-10 
as hadronic constituent, 6-10 
charges, 7 
charm, 8 
colour, 8—9 
flavour, 9 
masses, 9-10 
model potential, 22, 30 
parton model, 231-234, 240 
sum rules, 232-233 
probability distribution functions, 
231-234 
quantum numbers, 8-9 
sea, 233 
valence, 233 
Quarks, confinement of, see Confinement 


p-dominance of pion form factor, 238-240 
R (ete annihilation ratio), 239-240 
Regularization, 253, 261, 273 
cut-off method, 261-262 
dimensional, 275 
Relativity 
general, 39 
special, 308-311 
Renormalizability, 25-27, 267-269 
and gauge invariance, 26, 49, 297 
as criterion for physical theory, 291-297 
criteria for, 135, 291-297 
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Renormalization, 25-27, 135, 262-267, 
270-278 
and Higgs sector of SM, 25-27 
conditions, 265-267 
constant, 256, 264 
field strength, 255-257 
group, 27, 282 
mass, 253-254 
of QED, 270-278 
Resonance width, 144 
Rosenbluth cross section, 216 
Running coupling constant, see 
Coupling constant 
running 
Rutherford 
scattering, 17, 21, 185, 200 
from charge distribution, 201-202 


σ (Pauli) matrices, 55, 58, 69, 302 
Sakharov conditions, 86 
s-channel, 143, 206-207, 247 
Scalar field, 96 
Scalar potential, 36 
Scalar (under P), 81 
Scaling, see also Bjorken scaling 
in Drell-Yan process, 234-237, 244 
and operator product expansions, 226 
variables, 225-227 
violations, 225 
Scattering 
amplitude, 330-332 
as exchange process, 13-17, 142 
Compton, of electron, 207-210 
Coulomb 
of charged spin-½ particles, 187-194 
of charged spinless particles, 183-187 
e⁻d, 227 
e⁻, from charge distribution, 201-202 
e⁻μ⁻, 210-213 
lowest order, in ‘laboratory frame’, 
212 
e⁻π⁺, elastic, 200-204 
e⁻-parton, 226 
e⁻-proton 
Bjorken scaling in, 225-229 
elastic, 7, 212-216 
inelastic, 222-242 
kinematics, 216-217 
structure functions, 223-230 
e⁻s⁺, 194-200 
e⁺e⁻ → μ⁺μ⁻, 212-213, 220-221, 235 
e⁺e⁻ → π⁺π⁻, 204-207 
qq̄ → μμ̄, 235 
quasi-elastic, 227 
Rutherford, see Rutherford scattering 
theory, 
non-relativistic, 330-334 
time-dependent, 333-334 
time-independent, 330-333 
Schrödinger equation for spinless particles, 
39-41, 47-48, 51, 301 
and Galilean transformation, 92 
free-particle solutions, 301 
interaction with electromagnetic field, 
39-49, 302 
probability current density, 68, 301 
probability density, 68, 301 
Schrödinger picture (formulation), 107, 127- 
129, 335-336 
Sea, of negative-energy states, 62-63 
Second quantization, 119 
Self-energy 
fermion, in QED, 272-273 
in ABC theory, 248-253, 259-261 
renormalized, 206 
one-particle irreducible, 253 
photon, in QED, 273-278 
imaginary part of, 284-285 
renormalized, 275-285 
Singularity, 143, 322 
Slash notation, 91, 161, 188 
S-matrix, 169, 258, 263, 276, 284 
Lorentz invariance of, 157 
unitarity of, 130 
S-operator, 131 
Dyson expansion of, 131 
Lorentz invariance of, 138 
Special relativity, 308-311 
Spin matrices, 302 
Spin-statistics connection, 158-162 
Spin sums and projection operators, 338 
Spinor, 55 
and rotations, 74-77 
and velocity transformations (boosts), 
77-78 
conjugate, 91 
four-component, 57 
negative-energy, 61-68, 91 
positive-energy, 60-61, 91 
rest-frame, 59 
self-conjugate (Majorana), 84, 175 
two-component, 56, 58 
Spontaneously broken symmetry, 10, 12, 26 
Standard Model, 3-4, 6, 33, 38, 49, 96, 121, 
148, 247, 251, 283, 289-291 
Stokes’ theorem, 320 
Strangeness, 9 
conservation, 9, 29 
Strange quark, 9 
String theory, 4 
Strong interactions, 21-23 
Structure function, 222-223, 225-228 
and positivity properties, 230 
of proton, electromagnetic, 222-223 
scaling of, 225-231 
Subtraction, 266, 294 
Sum rules, see Quark parton model 
Summation convention, 308 
Supersymmetry, 117 
Super-renormalizable theory, 135, 262, 269, 
292 
Symmetry 
current, 153, 160, 171 
internal, 46 
operator, 152-154, 160 


t-channel, 206, 283 
Tau lepton, 4-6 
and neutrino, 6 
Tensor, 309 
antisymmetric 
4-D, 214, 338 
3-D, 214, 340 
boson, 199 
electromagnetic field strength, 37-38, 
50, 309 
hadron, in inelastic e⁻p scattering, 
222-224 
lepton, 191, 199, 211, 222 
metric, 309 
proton, 214 
Theta function, 139-140, 323-324 
Time-ordering symbol, 131, 149 
and fermions, 161 
and Feynman graphs, 138-139, 156 
and Lorentz invariance, 137-138 
Time-reversal, 86-90 
in qft, 176-177 
invariance, in electromagnetic 
interactions, 177 
operator T, 88, 176-177 
and Dirac equation, 88 
and KG equation, 88 
not unitary, 88, 176 
transformation T, 86 
and Dirac equation, 87-90 
and KG equation, 87 
violation, in weak interactions, 89 
Tomonaga-Schwinger equation, 129 
Top quark, 8 
Trace techniques, for spin summations, 
191-193 
Trace theorems, 192-193, 338-340 
Transformation 
gauge, in electromagnetic theory, 
36-44 
and dynamics, 28, 171 
global, 33-34, 38, 43 
local, 33-34, 38, 43, 45 
Lorentz, see Lorentz transformations 
O(2), 152 
Tree diagrams, 247 


u-channel, 143, 207, 247-248 
u-variable, 143, 250 
U(1) 
group, 46 
phase invariance, 46 
global, 151-155, 160, 170-172 
local, 46, 151, 170-172 
Uehling effect, 279 
Unification, 18-21 
Unitarity, 130, 284 
Units 
Gaussian CGS, 306-307 
rationalized, 306-307 
natural, 304-305 
Universality, 4, 27, 45, 287, 297 
and renormalization, 287 
lepton, 4, 27 
of electromagnetic interaction, 44 
of gauge field interaction, 44, 287 
Upsilon meson, 8, 10 


Vacuum, 12, 127, 148, 155, 158, 167, 254, 
281 
and Dirac sea, 62 
and field system ground state, 118-119 
and many-body ground state, 12, 26 
and symmetry-breaking, 26-27 
polarization, 281-282, 287 
quantum fluctuations in, 254, 258 
Vacuum expectation values, 135-137 
Vector potential, 36 
Vertex 
ABC theory, 141 
correction 
in ABC theory, 258-259 
in QED, 285-287 
pion electromagnetic, 203-204 
proton electromagnetic, 213-216 
Vibrating string, 101-103 
energy of, 103 
modes of, 102 
Virtual Compton process, 210, 219 
Virtual photon, 143-144, 229-231, 242-243 
Virtual quantum, 143 
Virtual transitions, 147, 254 


W boson, 6, 18-22, 24-28, 63, 157 
polarization states, 24, 164 
Ward identity, 208-209, 271, 286 
Wavefunction 
and quantum field, 119 
phase of, 43-44, 151 
Wavelength, Compton, of electron, 281 
Wave-particle duality, 11, 118 
Weak interaction, 18-21 
range, 18-19 
Wick’s theorem, 137 


Yang-Mills theory, 38-39 

Yukawa interaction, 13-15, 25-26, 127, 142, 
296 

Yukawa potential, 13, 227, 330 

Yukawa-Wick argument, 15 


Z₁ = Z₂ in QED, 271, 286-287 

Z⁰ boson, 6, 19, 22, 24, 26, 28, 220, 283 
polarization states, 24, 164 

Zero-point energy, 110, 117 


